{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Holiday Package Prediction\n",
    "\n",
    "### 1) Problem statement.\n",
    "The company \"Trips & Travel.Com\" wants to establish a viable business model to expand its customer base.\n",
    "One way to expand the customer base is to introduce a new package offering. Currently, the company offers five types of packages: Basic, Standard, Deluxe, Super Deluxe, and King. Last year's data show that 18% of the customers purchased a package. However, the marketing cost was quite high because customers were contacted at random without using the available information.\n",
    "The company is now planning to launch a new product, the Wellness Tourism Package. Wellness tourism is defined as travel that allows the traveler to maintain, enhance, or kick-start a healthy lifestyle, and to support or increase their sense of well-being.\n",
    "This time, however, the company wants to use the available data on existing and potential customers to make its marketing expenditure more efficient.\n",
    "### 2) Data Collection.\n",
    "The dataset is collected from https://www.kaggle.com/datasets/susant4learning/holiday-package-purchase-prediction\n",
    "The data consists of 20 columns and 4888 rows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "## importing important libraries\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "import plotly.express as px\n",
    "import warnings\n",
    "\n",
    "warnings.filterwarnings(\"ignore\")\n",
    "\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>CustomerID</th>\n",
       "      <th>ProdTaken</th>\n",
       "      <th>Age</th>\n",
       "      <th>TypeofContact</th>\n",
       "      <th>CityTier</th>\n",
       "      <th>DurationOfPitch</th>\n",
       "      <th>Occupation</th>\n",
       "      <th>Gender</th>\n",
       "      <th>NumberOfPersonVisiting</th>\n",
       "      <th>NumberOfFollowups</th>\n",
       "      <th>ProductPitched</th>\n",
       "      <th>PreferredPropertyStar</th>\n",
       "      <th>MaritalStatus</th>\n",
       "      <th>NumberOfTrips</th>\n",
       "      <th>Passport</th>\n",
       "      <th>PitchSatisfactionScore</th>\n",
       "      <th>OwnCar</th>\n",
       "      <th>NumberOfChildrenVisiting</th>\n",
       "      <th>Designation</th>\n",
       "      <th>MonthlyIncome</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>200000</td>\n",
       "      <td>1</td>\n",
       "      <td>41.0</td>\n",
       "      <td>Self Enquiry</td>\n",
       "      <td>3</td>\n",
       "      <td>6.0</td>\n",
       "      <td>Salaried</td>\n",
       "      <td>Female</td>\n",
       "      <td>3</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Deluxe</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Single</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0.0</td>\n",
       "      <td>Manager</td>\n",
       "      <td>20993.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>200001</td>\n",
       "      <td>0</td>\n",
       "      <td>49.0</td>\n",
       "      <td>Company Invited</td>\n",
       "      <td>1</td>\n",
       "      <td>14.0</td>\n",
       "      <td>Salaried</td>\n",
       "      <td>Male</td>\n",
       "      <td>3</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Deluxe</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>2.0</td>\n",
       "      <td>Manager</td>\n",
       "      <td>20130.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>200002</td>\n",
       "      <td>1</td>\n",
       "      <td>37.0</td>\n",
       "      <td>Self Enquiry</td>\n",
       "      <td>1</td>\n",
       "      <td>8.0</td>\n",
       "      <td>Free Lancer</td>\n",
       "      <td>Male</td>\n",
       "      <td>3</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Basic</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Single</td>\n",
       "      <td>7.0</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>Executive</td>\n",
       "      <td>17090.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>200003</td>\n",
       "      <td>0</td>\n",
       "      <td>33.0</td>\n",
       "      <td>Company Invited</td>\n",
       "      <td>1</td>\n",
       "      <td>9.0</td>\n",
       "      <td>Salaried</td>\n",
       "      <td>Female</td>\n",
       "      <td>2</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Basic</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>2.0</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>1.0</td>\n",
       "      <td>Executive</td>\n",
       "      <td>17909.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>200004</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Self Enquiry</td>\n",
       "      <td>1</td>\n",
       "      <td>8.0</td>\n",
       "      <td>Small Business</td>\n",
       "      <td>Male</td>\n",
       "      <td>2</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Basic</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>0.0</td>\n",
       "      <td>Executive</td>\n",
       "      <td>18468.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   CustomerID  ProdTaken   Age    TypeofContact  CityTier  DurationOfPitch  \\\n",
       "0      200000          1  41.0     Self Enquiry         3              6.0   \n",
       "1      200001          0  49.0  Company Invited         1             14.0   \n",
       "2      200002          1  37.0     Self Enquiry         1              8.0   \n",
       "3      200003          0  33.0  Company Invited         1              9.0   \n",
       "4      200004          0   NaN     Self Enquiry         1              8.0   \n",
       "\n",
       "       Occupation  Gender  NumberOfPersonVisiting  NumberOfFollowups  \\\n",
       "0        Salaried  Female                       3                3.0   \n",
       "1        Salaried    Male                       3                4.0   \n",
       "2     Free Lancer    Male                       3                4.0   \n",
       "3        Salaried  Female                       2                3.0   \n",
       "4  Small Business    Male                       2                3.0   \n",
       "\n",
       "  ProductPitched  PreferredPropertyStar MaritalStatus  NumberOfTrips  \\\n",
       "0         Deluxe                    3.0        Single            1.0   \n",
       "1         Deluxe                    4.0      Divorced            2.0   \n",
       "2          Basic                    3.0        Single            7.0   \n",
       "3          Basic                    3.0      Divorced            2.0   \n",
       "4          Basic                    4.0      Divorced            1.0   \n",
       "\n",
       "   Passport  PitchSatisfactionScore  OwnCar  NumberOfChildrenVisiting  \\\n",
       "0         1                       2       1                       0.0   \n",
       "1         0                       3       1                       2.0   \n",
       "2         1                       3       0                       0.0   \n",
       "3         1                       5       1                       1.0   \n",
       "4         0                       5       1                       0.0   \n",
       "\n",
       "  Designation  MonthlyIncome  \n",
       "0     Manager        20993.0  \n",
       "1     Manager        20130.0  \n",
       "2   Executive        17090.0  \n",
       "3   Executive        17909.0  \n",
       "4   Executive        18468.0  "
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.read_csv(\"Travel.csv\")\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Cleaning\n",
    "### Handling Missing values\n",
    "1. Handling Missing values\n",
    "2. Handling Duplicates\n",
    "3. Check data type\n",
    "4. Understand the dataset"
   ]
  },
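  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The checklist above also mentions duplicates and data types. As a minimal sketch of those two checks, assuming a small made-up frame named `demo` (its values are hypothetical and are not taken from `Travel.csv`), the steps might look like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data; the last row repeats the first one exactly\n",
    "demo = pd.DataFrame({\n",
    "    'Age': [41.0, 49.0, 41.0],\n",
    "    'Gender': ['Female', 'Male', 'Female'],\n",
    "})\n",
    "\n",
    "# Count rows that are exact copies of an earlier row\n",
    "n_duplicates = int(demo.duplicated().sum())\n",
    "\n",
    "# Keep only the first occurrence of each row\n",
    "demo_unique = demo.drop_duplicates().reset_index(drop=True)\n",
    "\n",
    "# Inspect the data type pandas inferred for each column\n",
    "column_types = demo.dtypes\n",
    "\n",
    "print(n_duplicates)\n",
    "print(column_types)"
   ]
  },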
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "CustomerID                    0\n",
       "ProdTaken                     0\n",
       "Age                         226\n",
       "TypeofContact                25\n",
       "CityTier                      0\n",
       "DurationOfPitch             251\n",
       "Occupation                    0\n",
       "Gender                        0\n",
       "NumberOfPersonVisiting        0\n",
       "NumberOfFollowups            45\n",
       "ProductPitched                0\n",
       "PreferredPropertyStar        26\n",
       "MaritalStatus                 0\n",
       "NumberOfTrips               140\n",
       "Passport                      0\n",
       "PitchSatisfactionScore        0\n",
       "OwnCar                        0\n",
       "NumberOfChildrenVisiting     66\n",
       "Designation                   0\n",
       "MonthlyIncome               233\n",
       "dtype: int64"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.isnull().sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Male       2916\n",
       "Female     1817\n",
       "Fe Male     155\n",
       "Name: Gender, dtype: int64"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "### Check all the categories \n",
    "df['Gender'].value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Married      2340\n",
       "Divorced      950\n",
       "Single        916\n",
       "Unmarried     682\n",
       "Name: MaritalStatus, dtype: int64"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['MaritalStatus'].value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Self Enquiry       3444\n",
       "Company Invited    1419\n",
       "Name: TypeofContact, dtype: int64"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['TypeofContact'].value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "df['Gender'] = df['Gender'].replace('Fe Male', 'Female')\n",
    "df['MaritalStatus'] = df['MaritalStatus'].replace('Single', 'Unmarried')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Male      2916\n",
       "Female    1972\n",
       "Name: Gender, dtype: int64"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "### Check all the categories \n",
    "df['Gender'].value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>CustomerID</th>\n",
       "      <th>ProdTaken</th>\n",
       "      <th>Age</th>\n",
       "      <th>TypeofContact</th>\n",
       "      <th>CityTier</th>\n",
       "      <th>DurationOfPitch</th>\n",
       "      <th>Occupation</th>\n",
       "      <th>Gender</th>\n",
       "      <th>NumberOfPersonVisiting</th>\n",
       "      <th>NumberOfFollowups</th>\n",
       "      <th>ProductPitched</th>\n",
       "      <th>PreferredPropertyStar</th>\n",
       "      <th>MaritalStatus</th>\n",
       "      <th>NumberOfTrips</th>\n",
       "      <th>Passport</th>\n",
       "      <th>PitchSatisfactionScore</th>\n",
       "      <th>OwnCar</th>\n",
       "      <th>NumberOfChildrenVisiting</th>\n",
       "      <th>Designation</th>\n",
       "      <th>MonthlyIncome</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>200000</td>\n",
       "      <td>1</td>\n",
       "      <td>41.0</td>\n",
       "      <td>Self Enquiry</td>\n",
       "      <td>3</td>\n",
       "      <td>6.0</td>\n",
       "      <td>Salaried</td>\n",
       "      <td>Female</td>\n",
       "      <td>3</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Deluxe</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>0.0</td>\n",
       "      <td>Manager</td>\n",
       "      <td>20993.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>200001</td>\n",
       "      <td>0</td>\n",
       "      <td>49.0</td>\n",
       "      <td>Company Invited</td>\n",
       "      <td>1</td>\n",
       "      <td>14.0</td>\n",
       "      <td>Salaried</td>\n",
       "      <td>Male</td>\n",
       "      <td>3</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Deluxe</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>2.0</td>\n",
       "      <td>Manager</td>\n",
       "      <td>20130.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>200002</td>\n",
       "      <td>1</td>\n",
       "      <td>37.0</td>\n",
       "      <td>Self Enquiry</td>\n",
       "      <td>1</td>\n",
       "      <td>8.0</td>\n",
       "      <td>Free Lancer</td>\n",
       "      <td>Male</td>\n",
       "      <td>3</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Basic</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>7.0</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>Executive</td>\n",
       "      <td>17090.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>200003</td>\n",
       "      <td>0</td>\n",
       "      <td>33.0</td>\n",
       "      <td>Company Invited</td>\n",
       "      <td>1</td>\n",
       "      <td>9.0</td>\n",
       "      <td>Salaried</td>\n",
       "      <td>Female</td>\n",
       "      <td>2</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Basic</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>2.0</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>1.0</td>\n",
       "      <td>Executive</td>\n",
       "      <td>17909.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>200004</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Self Enquiry</td>\n",
       "      <td>1</td>\n",
       "      <td>8.0</td>\n",
       "      <td>Small Business</td>\n",
       "      <td>Male</td>\n",
       "      <td>2</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Basic</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>0.0</td>\n",
       "      <td>Executive</td>\n",
       "      <td>18468.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   CustomerID  ProdTaken   Age    TypeofContact  CityTier  DurationOfPitch  \\\n",
       "0      200000          1  41.0     Self Enquiry         3              6.0   \n",
       "1      200001          0  49.0  Company Invited         1             14.0   \n",
       "2      200002          1  37.0     Self Enquiry         1              8.0   \n",
       "3      200003          0  33.0  Company Invited         1              9.0   \n",
       "4      200004          0   NaN     Self Enquiry         1              8.0   \n",
       "\n",
       "       Occupation  Gender  NumberOfPersonVisiting  NumberOfFollowups  \\\n",
       "0        Salaried  Female                       3                3.0   \n",
       "1        Salaried    Male                       3                4.0   \n",
       "2     Free Lancer    Male                       3                4.0   \n",
       "3        Salaried  Female                       2                3.0   \n",
       "4  Small Business    Male                       2                3.0   \n",
       "\n",
       "  ProductPitched  PreferredPropertyStar MaritalStatus  NumberOfTrips  \\\n",
       "0         Deluxe                    3.0     Unmarried            1.0   \n",
       "1         Deluxe                    4.0      Divorced            2.0   \n",
       "2          Basic                    3.0     Unmarried            7.0   \n",
       "3          Basic                    3.0      Divorced            2.0   \n",
       "4          Basic                    4.0      Divorced            1.0   \n",
       "\n",
       "   Passport  PitchSatisfactionScore  OwnCar  NumberOfChildrenVisiting  \\\n",
       "0         1                       2       1                       0.0   \n",
       "1         0                       3       1                       2.0   \n",
       "2         1                       3       0                       0.0   \n",
       "3         1                       5       1                       1.0   \n",
       "4         0                       5       1                       0.0   \n",
       "\n",
       "  Designation  MonthlyIncome  \n",
       "0     Manager        20993.0  \n",
       "1     Manager        20130.0  \n",
       "2   Executive        17090.0  \n",
       "3   Executive        17909.0  \n",
       "4   Executive        18468.0  "
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Age 4.62357 % missing values\n",
      "TypeofContact 0.51146 % missing values\n",
      "DurationOfPitch 5.13502 % missing values\n",
      "NumberOfFollowups 0.92062 % missing values\n",
      "PreferredPropertyStar 0.53191 % missing values\n",
      "NumberOfTrips 2.86416 % missing values\n",
      "NumberOfChildrenVisiting 1.35025 % missing values\n",
      "MonthlyIncome 4.76678 % missing values\n"
     ]
    }
   ],
   "source": [
    "## Check missing values\n",
    "## These are the features with NaN values\n",
    "features_with_na = [feature for feature in df.columns if df[feature].isnull().sum() >= 1]\n",
    "for feature in features_with_na:\n",
    "    print(feature, np.round(df[feature].isnull().mean() * 100, 5), '% missing values')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Age</th>\n",
       "      <th>DurationOfPitch</th>\n",
       "      <th>NumberOfFollowups</th>\n",
       "      <th>PreferredPropertyStar</th>\n",
       "      <th>NumberOfTrips</th>\n",
       "      <th>NumberOfChildrenVisiting</th>\n",
       "      <th>MonthlyIncome</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>4662.000000</td>\n",
       "      <td>4637.000000</td>\n",
       "      <td>4843.000000</td>\n",
       "      <td>4862.000000</td>\n",
       "      <td>4748.000000</td>\n",
       "      <td>4822.000000</td>\n",
       "      <td>4655.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>37.622265</td>\n",
       "      <td>15.490835</td>\n",
       "      <td>3.708445</td>\n",
       "      <td>3.581037</td>\n",
       "      <td>3.236521</td>\n",
       "      <td>1.187267</td>\n",
       "      <td>23619.853491</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>9.316387</td>\n",
       "      <td>8.519643</td>\n",
       "      <td>1.002509</td>\n",
       "      <td>0.798009</td>\n",
       "      <td>1.849019</td>\n",
       "      <td>0.857861</td>\n",
       "      <td>5380.698361</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>18.000000</td>\n",
       "      <td>5.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1000.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>31.000000</td>\n",
       "      <td>9.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>20346.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>36.000000</td>\n",
       "      <td>13.000000</td>\n",
       "      <td>4.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>22347.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>44.000000</td>\n",
       "      <td>20.000000</td>\n",
       "      <td>4.000000</td>\n",
       "      <td>4.000000</td>\n",
       "      <td>4.000000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>25571.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>61.000000</td>\n",
       "      <td>127.000000</td>\n",
       "      <td>6.000000</td>\n",
       "      <td>5.000000</td>\n",
       "      <td>22.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>98678.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "               Age  DurationOfPitch  NumberOfFollowups  PreferredPropertyStar  \\\n",
       "count  4662.000000      4637.000000        4843.000000            4862.000000   \n",
       "mean     37.622265        15.490835           3.708445               3.581037   \n",
       "std       9.316387         8.519643           1.002509               0.798009   \n",
       "min      18.000000         5.000000           1.000000               3.000000   \n",
       "25%      31.000000         9.000000           3.000000               3.000000   \n",
       "50%      36.000000        13.000000           4.000000               3.000000   \n",
       "75%      44.000000        20.000000           4.000000               4.000000   \n",
       "max      61.000000       127.000000           6.000000               5.000000   \n",
       "\n",
       "       NumberOfTrips  NumberOfChildrenVisiting  MonthlyIncome  \n",
       "count    4748.000000               4822.000000    4655.000000  \n",
       "mean        3.236521                  1.187267   23619.853491  \n",
       "std         1.849019                  0.857861    5380.698361  \n",
       "min         1.000000                  0.000000    1000.000000  \n",
       "25%         2.000000                  1.000000   20346.000000  \n",
       "50%         3.000000                  1.000000   22347.000000  \n",
       "75%         4.000000                  2.000000   25571.000000  \n",
       "max        22.000000                  3.000000   98678.000000  "
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# statistics on numerical columns (Null cols)\n",
    "df[features_with_na].select_dtypes(exclude='object').describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Imputing Null values\n",
    "1. Impute the median for `Age`\n",
    "2. Impute the mode for `TypeofContact`\n",
    "3. Impute the median for `DurationOfPitch`\n",
    "4. Impute the mode for `NumberOfFollowups`, as it is a discrete feature\n",
    "5. Impute the mode for `PreferredPropertyStar`\n",
    "6. Impute the median for `NumberOfTrips`\n",
    "7. Impute the mode for `NumberOfChildrenVisiting`\n",
    "8. Impute the median for `MonthlyIncome`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Assign each result back to df: fillna(..., inplace=True) on a selected column\n",
    "# acts on a temporary object under pandas copy-on-write and may leave df unchanged\n",
    "\n",
    "# Age\n",
    "df['Age'] = df['Age'].fillna(df['Age'].median())\n",
    "\n",
    "# TypeofContact\n",
    "df['TypeofContact'] = df['TypeofContact'].fillna(df['TypeofContact'].mode()[0])\n",
    "\n",
    "# DurationOfPitch\n",
    "df['DurationOfPitch'] = df['DurationOfPitch'].fillna(df['DurationOfPitch'].median())\n",
    "\n",
    "# NumberOfFollowups\n",
    "df['NumberOfFollowups'] = df['NumberOfFollowups'].fillna(df['NumberOfFollowups'].mode()[0])\n",
    "\n",
    "# PreferredPropertyStar\n",
    "df['PreferredPropertyStar'] = df['PreferredPropertyStar'].fillna(df['PreferredPropertyStar'].mode()[0])\n",
    "\n",
    "# NumberOfTrips\n",
    "df['NumberOfTrips'] = df['NumberOfTrips'].fillna(df['NumberOfTrips'].median())\n",
    "\n",
    "# NumberOfChildrenVisiting\n",
    "df['NumberOfChildrenVisiting'] = df['NumberOfChildrenVisiting'].fillna(df['NumberOfChildrenVisiting'].mode()[0])\n",
    "\n",
    "# MonthlyIncome\n",
    "df['MonthlyIncome'] = df['MonthlyIncome'].fillna(df['MonthlyIncome'].median())"
   ]
  },
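  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The imputation plan can also be written as one loop over a column-to-strategy mapping. The cell below is only a sketch on a small made-up frame named `demo_na`; its values and the `strategy` dictionary are illustrative assumptions, not part of the original analysis."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data with a gap in one continuous and one discrete column\n",
    "demo_na = pd.DataFrame({\n",
    "    'MonthlyIncome': [20000.0, np.nan, 22000.0, 30000.0],\n",
    "    'NumberOfFollowups': [3.0, 4.0, np.nan, 4.0],\n",
    "})\n",
    "\n",
    "# Illustrative mapping: median for continuous columns, mode for discrete ones\n",
    "strategy = {'MonthlyIncome': 'median', 'NumberOfFollowups': 'mode'}\n",
    "\n",
    "for column, how in strategy.items():\n",
    "    if how == 'median':\n",
    "        fill_value = demo_na[column].median()\n",
    "    else:\n",
    "        # mode() returns a Series; take its first (smallest) most frequent value\n",
    "        fill_value = demo_na[column].mode()[0]\n",
    "    # Assign the filled column back to the frame\n",
    "    demo_na[column] = demo_na[column].fillna(fill_value)\n",
    "\n",
    "print(demo_na)"
   ]
  },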
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "CustomerID                  0\n",
       "ProdTaken                   0\n",
       "Age                         0\n",
       "TypeofContact               0\n",
       "CityTier                    0\n",
       "DurationOfPitch             0\n",
       "Occupation                  0\n",
       "Gender                      0\n",
       "NumberOfPersonVisiting      0\n",
       "NumberOfFollowups           0\n",
       "ProductPitched              0\n",
       "PreferredPropertyStar       0\n",
       "MaritalStatus               0\n",
       "NumberOfTrips               0\n",
       "Passport                    0\n",
       "PitchSatisfactionScore      0\n",
       "OwnCar                      0\n",
       "NumberOfChildrenVisiting    0\n",
       "Designation                 0\n",
       "MonthlyIncome               0\n",
       "dtype: int64"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Confirm that no missing values remain after imputation\n",
    "df.isnull().sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.drop('CustomerID', inplace=True, axis=1)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Feature Engineering\n",
    "\n",
    "### Feature Extraction"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Combine the person and child visitor counts into one TotalVisiting feature,\n",
    "# then drop the two source columns\n",
    "df['TotalVisiting'] = df['NumberOfPersonVisiting'] + df['NumberOfChildrenVisiting']\n",
    "df.drop(columns=['NumberOfPersonVisiting', 'NumberOfChildrenVisiting'], inplace=True)"
   ]
  },
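  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The feature-extraction step above can be sketched on a toy frame. This is a minimal illustration rather than the notebook's exact code: the toy values are made up, and using `Series.add(..., fill_value=0)` to treat a missing child count as zero is an assumption that only applies if the columns still contain missing values.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Toy frame with hypothetical values; column names follow the notebook\n",
    "toy = pd.DataFrame({\n",
    "    'NumberOfPersonVisiting': [3, 2, 4],\n",
    "    'NumberOfChildrenVisiting': [1.0, None, 2.0],\n",
    "})\n",
    "\n",
    "# Assumption: a missing child count is treated as zero instead of propagating NaN\n",
    "toy['TotalVisiting'] = toy['NumberOfPersonVisiting'].add(\n",
    "    toy['NumberOfChildrenVisiting'], fill_value=0\n",
    ")\n",
    "\n",
    "# Drop the two source columns now that they are folded into TotalVisiting\n",
    "toy = toy.drop(columns=['NumberOfPersonVisiting', 'NumberOfChildrenVisiting'])\n",
    "print(toy)\n",
    "```\n",
    "\n",
    "With a plain `+`, the second toy row would come out as `NaN`, so the two approaches differ only when the inputs have not been imputed."
   ]
  },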
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Num of Numerical Features : 12\n"
     ]
    }
   ],
   "source": [
    "## Numeric features: every non-object column (this count includes the target ProdTaken)\n",
    "num_features = [feature for feature in df.columns if df[feature].dtype != 'O']\n",
    "print('Num of Numerical Features :', len(num_features))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Num of Categorical Features : 6\n"
     ]
    }
   ],
   "source": [
    "## Categorical features: columns with object dtype\n",
    "cat_features = [feature for feature in df.columns if df[feature].dtype == 'O']\n",
    "print('Num of Categorical Features :', len(cat_features))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Num of Discrete Features : 9\n"
     ]
    }
   ],
   "source": [
    "## Discrete features: numeric columns with at most 25 unique values\n",
    "discrete_features = [feature for feature in num_features if len(df[feature].unique()) <= 25]\n",
    "print('Num of Discrete Features :', len(discrete_features))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Num of Continuous Features : 3\n"
     ]
    }
   ],
   "source": [
    "## Continuous features: the remaining numeric columns\n",
    "continuous_features = [feature for feature in num_features if feature not in discrete_features]\n",
    "print('Num of Continuous Features :', len(continuous_features))"
   ]
  },
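  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The dtype-based grouping above can also be written with `pd.api.types.is_numeric_dtype`, which avoids counting non-object, non-numeric columns (for example datetimes) as numeric. The sketch below is an alternative under stated assumptions, not the notebook's method: the toy frame and the `max_levels` threshold are hypothetical, and `nunique()` ignores missing values whereas `len(unique())` counts `NaN` as a level.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Toy frame with hypothetical values; column names follow the notebook\n",
    "toy = pd.DataFrame({\n",
    "    'Age': [41.0, 49.0, 37.0, 33.0],\n",
    "    'CityTier': [3, 1, 1, 1],\n",
    "    'Gender': ['Female', 'Male', 'Male', 'Female'],\n",
    "})\n",
    "\n",
    "max_levels = 2  # hypothetical threshold for the toy data; the notebook uses 25\n",
    "\n",
    "# Numeric vs. categorical split based on the column dtype\n",
    "num_cols = [c for c in toy.columns if pd.api.types.is_numeric_dtype(toy[c])]\n",
    "cat_cols = [c for c in toy.columns if c not in num_cols]\n",
    "\n",
    "# Numeric columns with few distinct values are treated as discrete\n",
    "discrete_cols = [c for c in num_cols if toy[c].nunique() <= max_levels]\n",
    "continuous_cols = [c for c in num_cols if c not in discrete_cols]\n",
    "\n",
    "print(num_cols, cat_cols, discrete_cols, continuous_cols)\n",
    "```"
   ]
  },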
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ProdTaken</th>\n",
       "      <th>Age</th>\n",
       "      <th>TypeofContact</th>\n",
       "      <th>CityTier</th>\n",
       "      <th>DurationOfPitch</th>\n",
       "      <th>Occupation</th>\n",
       "      <th>Gender</th>\n",
       "      <th>NumberOfFollowups</th>\n",
       "      <th>ProductPitched</th>\n",
       "      <th>PreferredPropertyStar</th>\n",
       "      <th>MaritalStatus</th>\n",
       "      <th>NumberOfTrips</th>\n",
       "      <th>Passport</th>\n",
       "      <th>PitchSatisfactionScore</th>\n",
       "      <th>OwnCar</th>\n",
       "      <th>Designation</th>\n",
       "      <th>MonthlyIncome</th>\n",
       "      <th>TotalVisiting</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>41.0</td>\n",
       "      <td>Self Enquiry</td>\n",
       "      <td>3</td>\n",
       "      <td>6.0</td>\n",
       "      <td>Salaried</td>\n",
       "      <td>Female</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Deluxe</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>Manager</td>\n",
       "      <td>20993.0</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0</td>\n",
       "      <td>49.0</td>\n",
       "      <td>Company Invited</td>\n",
       "      <td>1</td>\n",
       "      <td>14.0</td>\n",
       "      <td>Salaried</td>\n",
       "      <td>Male</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Deluxe</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>Manager</td>\n",
       "      <td>20130.0</td>\n",
       "      <td>5.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>37.0</td>\n",
       "      <td>Self Enquiry</td>\n",
       "      <td>1</td>\n",
       "      <td>8.0</td>\n",
       "      <td>Free Lancer</td>\n",
       "      <td>Male</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Basic</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>7.0</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>Executive</td>\n",
       "      <td>17090.0</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0</td>\n",
       "      <td>33.0</td>\n",
       "      <td>Company Invited</td>\n",
       "      <td>1</td>\n",
       "      <td>9.0</td>\n",
       "      <td>Salaried</td>\n",
       "      <td>Female</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Basic</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>2.0</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>Executive</td>\n",
       "      <td>17909.0</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>36.0</td>\n",
       "      <td>Self Enquiry</td>\n",
       "      <td>1</td>\n",
       "      <td>8.0</td>\n",
       "      <td>Small Business</td>\n",
       "      <td>Male</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Basic</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>Executive</td>\n",
       "      <td>18468.0</td>\n",
       "      <td>2.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   ProdTaken   Age    TypeofContact  CityTier  DurationOfPitch  \\\n",
       "0          1  41.0     Self Enquiry         3              6.0   \n",
       "1          0  49.0  Company Invited         1             14.0   \n",
       "2          1  37.0     Self Enquiry         1              8.0   \n",
       "3          0  33.0  Company Invited         1              9.0   \n",
       "4          0  36.0     Self Enquiry         1              8.0   \n",
       "\n",
       "       Occupation  Gender  NumberOfFollowups ProductPitched  \\\n",
       "0        Salaried  Female                3.0         Deluxe   \n",
       "1        Salaried    Male                4.0         Deluxe   \n",
       "2     Free Lancer    Male                4.0          Basic   \n",
       "3        Salaried  Female                3.0          Basic   \n",
       "4  Small Business    Male                3.0          Basic   \n",
       "\n",
       "   PreferredPropertyStar MaritalStatus  NumberOfTrips  Passport  \\\n",
       "0                    3.0     Unmarried            1.0         1   \n",
       "1                    4.0      Divorced            2.0         0   \n",
       "2                    3.0     Unmarried            7.0         1   \n",
       "3                    3.0      Divorced            2.0         1   \n",
       "4                    4.0      Divorced            1.0         0   \n",
       "\n",
       "   PitchSatisfactionScore  OwnCar Designation  MonthlyIncome  TotalVisiting  \n",
       "0                       2       1     Manager        20993.0            3.0  \n",
       "1                       3       1     Manager        20130.0            5.0  \n",
       "2                       3       0   Executive        17090.0            3.0  \n",
       "3                       5       1   Executive        17909.0            3.0  \n",
       "4                       5       1   Executive        18468.0            2.0  "
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train Test Split And Model Training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.model_selection import train_test_split\n",
    "X = df.drop(['ProdTaken'], axis=1)\n",
    "y = df['ProdTaken']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Age</th>\n",
       "      <th>TypeofContact</th>\n",
       "      <th>CityTier</th>\n",
       "      <th>DurationOfPitch</th>\n",
       "      <th>Occupation</th>\n",
       "      <th>Gender</th>\n",
       "      <th>NumberOfFollowups</th>\n",
       "      <th>ProductPitched</th>\n",
       "      <th>PreferredPropertyStar</th>\n",
       "      <th>MaritalStatus</th>\n",
       "      <th>NumberOfTrips</th>\n",
       "      <th>Passport</th>\n",
       "      <th>PitchSatisfactionScore</th>\n",
       "      <th>OwnCar</th>\n",
       "      <th>Designation</th>\n",
       "      <th>MonthlyIncome</th>\n",
       "      <th>TotalVisiting</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>41.0</td>\n",
       "      <td>Self Enquiry</td>\n",
       "      <td>3</td>\n",
       "      <td>6.0</td>\n",
       "      <td>Salaried</td>\n",
       "      <td>Female</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Deluxe</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>Manager</td>\n",
       "      <td>20993.0</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>49.0</td>\n",
       "      <td>Company Invited</td>\n",
       "      <td>1</td>\n",
       "      <td>14.0</td>\n",
       "      <td>Salaried</td>\n",
       "      <td>Male</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Deluxe</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>Manager</td>\n",
       "      <td>20130.0</td>\n",
       "      <td>5.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>37.0</td>\n",
       "      <td>Self Enquiry</td>\n",
       "      <td>1</td>\n",
       "      <td>8.0</td>\n",
       "      <td>Free Lancer</td>\n",
       "      <td>Male</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Basic</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>7.0</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>Executive</td>\n",
       "      <td>17090.0</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>33.0</td>\n",
       "      <td>Company Invited</td>\n",
       "      <td>1</td>\n",
       "      <td>9.0</td>\n",
       "      <td>Salaried</td>\n",
       "      <td>Female</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Basic</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>2.0</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>Executive</td>\n",
       "      <td>17909.0</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>36.0</td>\n",
       "      <td>Self Enquiry</td>\n",
       "      <td>1</td>\n",
       "      <td>8.0</td>\n",
       "      <td>Small Business</td>\n",
       "      <td>Male</td>\n",
       "      <td>3.0</td>\n",
       "      <td>Basic</td>\n",
       "      <td>4.0</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "      <td>Executive</td>\n",
       "      <td>18468.0</td>\n",
       "      <td>2.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Age    TypeofContact  CityTier  DurationOfPitch      Occupation  Gender  \\\n",
       "0  41.0     Self Enquiry         3              6.0        Salaried  Female   \n",
       "1  49.0  Company Invited         1             14.0        Salaried    Male   \n",
       "2  37.0     Self Enquiry         1              8.0     Free Lancer    Male   \n",
       "3  33.0  Company Invited         1              9.0        Salaried  Female   \n",
       "4  36.0     Self Enquiry         1              8.0  Small Business    Male   \n",
       "\n",
       "   NumberOfFollowups ProductPitched  PreferredPropertyStar MaritalStatus  \\\n",
       "0                3.0         Deluxe                    3.0     Unmarried   \n",
       "1                4.0         Deluxe                    4.0      Divorced   \n",
       "2                4.0          Basic                    3.0     Unmarried   \n",
       "3                3.0          Basic                    3.0      Divorced   \n",
       "4                3.0          Basic                    4.0      Divorced   \n",
       "\n",
       "   NumberOfTrips  Passport  PitchSatisfactionScore  OwnCar Designation  \\\n",
       "0            1.0         1                       2       1     Manager   \n",
       "1            2.0         0                       3       1     Manager   \n",
       "2            7.0         1                       3       0   Executive   \n",
       "3            2.0         1                       5       1   Executive   \n",
       "4            1.0         0                       5       1   Executive   \n",
       "\n",
       "   MonthlyIncome  TotalVisiting  \n",
       "0        20993.0            3.0  \n",
       "1        20130.0            5.0  \n",
       "2        17090.0            3.0  \n",
       "3        17909.0            3.0  \n",
       "4        18468.0            2.0  "
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    3968\n",
       "1     920\n",
       "Name: ProdTaken, dtype: int64"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y.value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "((3910, 17), (978, 17))"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Separate the dataset into train (80%) and test (20%) sets\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n",
    "X_train.shape, X_test.shape"
   ]
  },
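  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The target is imbalanced (3968 negatives vs 920 positives), so a purely random split can leave the train and test sets with slightly different class ratios. One common option is the `stratify` argument of `train_test_split`. The sketch below uses a hypothetical toy target with a similar ratio; it is not the split used in this notebook.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "# Hypothetical toy data: 80 negatives and 20 positives\n",
    "y_toy = np.array([0] * 80 + [1] * 20)\n",
    "X_toy = np.arange(100).reshape(-1, 1)\n",
    "\n",
    "# stratify=y_toy keeps the class proportions the same in both splits\n",
    "X_tr, X_te, y_tr, y_te = train_test_split(\n",
    "    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy\n",
    ")\n",
    "\n",
    "# Share of positives in each split\n",
    "print(y_tr.mean(), y_te.mean())\n",
    "```\n",
    "\n",
    "Switching to a stratified split would change which rows land in each set, so any metrics recorded later would need to be recomputed."
   ]
  },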
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 4888 entries, 0 to 4887\n",
      "Data columns (total 17 columns):\n",
      " #   Column                  Non-Null Count  Dtype  \n",
      "---  ------                  --------------  -----  \n",
      " 0   Age                     4888 non-null   float64\n",
      " 1   TypeofContact           4888 non-null   object \n",
      " 2   CityTier                4888 non-null   int64  \n",
      " 3   DurationOfPitch         4888 non-null   float64\n",
      " 4   Occupation              4888 non-null   object \n",
      " 5   Gender                  4888 non-null   object \n",
      " 6   NumberOfFollowups       4888 non-null   float64\n",
      " 7   ProductPitched          4888 non-null   object \n",
      " 8   PreferredPropertyStar   4888 non-null   float64\n",
      " 9   MaritalStatus           4888 non-null   object \n",
      " 10  NumberOfTrips           4888 non-null   float64\n",
      " 11  Passport                4888 non-null   int64  \n",
      " 12  PitchSatisfactionScore  4888 non-null   int64  \n",
      " 13  OwnCar                  4888 non-null   int64  \n",
      " 14  Designation             4888 non-null   object \n",
      " 15  MonthlyIncome           4888 non-null   float64\n",
      " 16  TotalVisiting           4888 non-null   float64\n",
      "dtypes: float64(7), int64(4), object(6)\n",
      "memory usage: 649.3+ KB\n"
     ]
    }
   ],
   "source": [
    "X.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create a ColumnTransformer with 2 transformers:\n",
    "# one-hot encoding for categorical columns, standard scaling for numeric ones\n",
    "from sklearn.preprocessing import OneHotEncoder, StandardScaler\n",
    "from sklearn.compose import ColumnTransformer\n",
    "\n",
    "cat_features = X.select_dtypes(include=\"object\").columns\n",
    "num_features = X.select_dtypes(exclude=\"object\").columns\n",
    "\n",
    "numeric_transformer = StandardScaler()\n",
    "oh_transformer = OneHotEncoder(drop='first')\n",
    "\n",
    "preprocessor = ColumnTransformer(\n",
    "    [\n",
    "        (\"OneHotEncoder\", oh_transformer, cat_features),\n",
    "        (\"StandardScaler\", numeric_transformer, num_features)\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "ColumnTransformer(transformers=[('OneHotEncoder', OneHotEncoder(drop='first'),\n",
       "                                 Index(['TypeofContact', 'Occupation', 'Gender', 'ProductPitched',\n",
       "       'MaritalStatus', 'Designation'],\n",
       "      dtype='object')),\n",
       "                                ('StandardScaler', StandardScaler(),\n",
       "                                 Index(['Age', 'CityTier', 'DurationOfPitch', 'NumberOfFollowups',\n",
       "       'PreferredPropertyStar', 'NumberOfTrips', 'Passport',\n",
       "       'PitchSatisfactionScore', 'OwnCar', 'MonthlyIncome', 'TotalVisiting'],\n",
       "      dtype='object'))])"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "preprocessor"
   ]
  },
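  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The preprocessor above can also be chained with an estimator in a scikit-learn `Pipeline`, so that encoder categories and scaler statistics are always learned only from the rows passed to `fit`. The following is a minimal sketch under assumptions, not the approach this notebook takes: the toy frame, the `toy_` names, and the choice of `LogisticRegression` are hypothetical and used only for illustration.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "from sklearn.compose import ColumnTransformer\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.pipeline import Pipeline\n",
    "from sklearn.preprocessing import OneHotEncoder, StandardScaler\n",
    "\n",
    "# Hypothetical toy data standing in for the notebook's X and y.\n",
    "toy_X = pd.DataFrame({\n",
    "    'Gender': ['Male', 'Female', 'Female', 'Male', 'Female', 'Male'],\n",
    "    'Age': [25.0, 40.0, 31.0, 52.0, 46.0, 29.0],\n",
    "})\n",
    "toy_y = pd.Series([0, 1, 0, 1, 1, 0])\n",
    "\n",
    "toy_pipeline = Pipeline([\n",
    "    ('preprocess', ColumnTransformer([\n",
    "        ('OneHotEncoder', OneHotEncoder(drop='first'), ['Gender']),\n",
    "        ('StandardScaler', StandardScaler(), ['Age']),\n",
    "    ])),\n",
    "    ('model', LogisticRegression()),\n",
    "])\n",
    "\n",
    "# fit() learns encoder categories and scaler statistics from the first four rows only;\n",
    "# predict() reuses them on the held-out rows without refitting.\n",
    "toy_pipeline.fit(toy_X.iloc[:4], toy_y.iloc[:4])\n",
    "toy_pred = toy_pipeline.predict(toy_X.iloc[4:])\n",
    "print(toy_pred)\n",
    "```\n",
    "\n",
    "The notebook itself applies `fit_transform` and `transform` by hand, which gives the same separation as long as the preprocessor is never refit on test data."
   ]
  },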
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Fit the preprocessor on the training data and transform it (fit_transform)\n",
    "X_train=preprocessor.fit_transform(X_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "      <th>7</th>\n",
       "      <th>8</th>\n",
       "      <th>9</th>\n",
       "      <th>...</th>\n",
       "      <th>16</th>\n",
       "      <th>17</th>\n",
       "      <th>18</th>\n",
       "      <th>19</th>\n",
       "      <th>20</th>\n",
       "      <th>21</th>\n",
       "      <th>22</th>\n",
       "      <th>23</th>\n",
       "      <th>24</th>\n",
       "      <th>25</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.721400</td>\n",
       "      <td>-1.020350</td>\n",
       "      <td>1.284279</td>\n",
       "      <td>-0.725271</td>\n",
       "      <td>-0.127737</td>\n",
       "      <td>-0.632399</td>\n",
       "      <td>0.679690</td>\n",
       "      <td>0.782966</td>\n",
       "      <td>-0.382245</td>\n",
       "      <td>-0.774151</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.721400</td>\n",
       "      <td>0.690023</td>\n",
       "      <td>0.282777</td>\n",
       "      <td>-0.725271</td>\n",
       "      <td>1.511598</td>\n",
       "      <td>-0.632399</td>\n",
       "      <td>0.679690</td>\n",
       "      <td>0.782966</td>\n",
       "      <td>-0.459799</td>\n",
       "      <td>0.643615</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.721400</td>\n",
       "      <td>-1.020350</td>\n",
       "      <td>0.282777</td>\n",
       "      <td>1.771041</td>\n",
       "      <td>0.418708</td>\n",
       "      <td>-0.632399</td>\n",
       "      <td>0.679690</td>\n",
       "      <td>0.782966</td>\n",
       "      <td>-0.245196</td>\n",
       "      <td>-0.065268</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.721400</td>\n",
       "      <td>-1.020350</td>\n",
       "      <td>1.284279</td>\n",
       "      <td>-0.725271</td>\n",
       "      <td>-0.127737</td>\n",
       "      <td>-0.632399</td>\n",
       "      <td>1.408395</td>\n",
       "      <td>-1.277194</td>\n",
       "      <td>0.213475</td>\n",
       "      <td>-0.065268</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.721400</td>\n",
       "      <td>2.400396</td>\n",
       "      <td>-1.720227</td>\n",
       "      <td>-0.725271</td>\n",
       "      <td>1.511598</td>\n",
       "      <td>-0.632399</td>\n",
       "      <td>-0.049015</td>\n",
       "      <td>-1.277194</td>\n",
       "      <td>-0.024889</td>\n",
       "      <td>2.061382</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3905</th>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.721400</td>\n",
       "      <td>-0.653841</td>\n",
       "      <td>1.284279</td>\n",
       "      <td>-0.725271</td>\n",
       "      <td>-0.674182</td>\n",
       "      <td>-0.632399</td>\n",
       "      <td>-1.506426</td>\n",
       "      <td>0.782966</td>\n",
       "      <td>-0.536973</td>\n",
       "      <td>0.643615</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3906</th>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.455047</td>\n",
       "      <td>-0.898180</td>\n",
       "      <td>-0.718725</td>\n",
       "      <td>1.771041</td>\n",
       "      <td>-1.220627</td>\n",
       "      <td>-0.632399</td>\n",
       "      <td>1.408395</td>\n",
       "      <td>0.782966</td>\n",
       "      <td>1.529609</td>\n",
       "      <td>-0.065268</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3907</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.455047</td>\n",
       "      <td>1.545210</td>\n",
       "      <td>0.282777</td>\n",
       "      <td>-0.725271</td>\n",
       "      <td>2.058043</td>\n",
       "      <td>-0.632399</td>\n",
       "      <td>-0.777720</td>\n",
       "      <td>0.782966</td>\n",
       "      <td>-0.360576</td>\n",
       "      <td>0.643615</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3908</th>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.455047</td>\n",
       "      <td>1.789549</td>\n",
       "      <td>1.284279</td>\n",
       "      <td>-0.725271</td>\n",
       "      <td>-0.127737</td>\n",
       "      <td>-0.632399</td>\n",
       "      <td>-1.506426</td>\n",
       "      <td>0.782966</td>\n",
       "      <td>-0.252799</td>\n",
       "      <td>0.643615</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3909</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>-0.721400</td>\n",
       "      <td>-0.776011</td>\n",
       "      <td>0.282777</td>\n",
       "      <td>-0.725271</td>\n",
       "      <td>-1.220627</td>\n",
       "      <td>1.581280</td>\n",
       "      <td>-0.049015</td>\n",
       "      <td>-1.277194</td>\n",
       "      <td>-1.082511</td>\n",
       "      <td>-1.483035</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>3910 rows × 26 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       0    1    2    3    4    5    6    7    8    9   ...        16  \\\n",
       "0     1.0  0.0  0.0  1.0  1.0  0.0  0.0  0.0  0.0  0.0  ... -0.721400   \n",
       "1     1.0  0.0  1.0  0.0  1.0  0.0  0.0  0.0  0.0  1.0  ... -0.721400   \n",
       "2     1.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ... -0.721400   \n",
       "3     1.0  0.0  1.0  0.0  1.0  1.0  0.0  0.0  0.0  1.0  ... -0.721400   \n",
       "4     0.0  0.0  0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  ... -0.721400   \n",
       "...   ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...       ...   \n",
       "3905  1.0  0.0  0.0  1.0  1.0  0.0  0.0  0.0  0.0  0.0  ... -0.721400   \n",
       "3906  1.0  0.0  1.0  0.0  0.0  0.0  0.0  0.0  1.0  0.0  ...  1.455047   \n",
       "3907  0.0  0.0  0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  1.455047   \n",
       "3908  1.0  0.0  0.0  1.0  0.0  1.0  0.0  0.0  0.0  1.0  ...  1.455047   \n",
       "3909  0.0  0.0  1.0  0.0  1.0  0.0  0.0  0.0  0.0  0.0  ... -0.721400   \n",
       "\n",
       "            17        18        19        20        21        22        23  \\\n",
       "0    -1.020350  1.284279 -0.725271 -0.127737 -0.632399  0.679690  0.782966   \n",
       "1     0.690023  0.282777 -0.725271  1.511598 -0.632399  0.679690  0.782966   \n",
       "2    -1.020350  0.282777  1.771041  0.418708 -0.632399  0.679690  0.782966   \n",
       "3    -1.020350  1.284279 -0.725271 -0.127737 -0.632399  1.408395 -1.277194   \n",
       "4     2.400396 -1.720227 -0.725271  1.511598 -0.632399 -0.049015 -1.277194   \n",
       "...        ...       ...       ...       ...       ...       ...       ...   \n",
       "3905 -0.653841  1.284279 -0.725271 -0.674182 -0.632399 -1.506426  0.782966   \n",
       "3906 -0.898180 -0.718725  1.771041 -1.220627 -0.632399  1.408395  0.782966   \n",
       "3907  1.545210  0.282777 -0.725271  2.058043 -0.632399 -0.777720  0.782966   \n",
       "3908  1.789549  1.284279 -0.725271 -0.127737 -0.632399 -1.506426  0.782966   \n",
       "3909 -0.776011  0.282777 -0.725271 -1.220627  1.581280 -0.049015 -1.277194   \n",
       "\n",
       "            24        25  \n",
       "0    -0.382245 -0.774151  \n",
       "1    -0.459799  0.643615  \n",
       "2    -0.245196 -0.065268  \n",
       "3     0.213475 -0.065268  \n",
       "4    -0.024889  2.061382  \n",
       "...        ...       ...  \n",
       "3905 -0.536973  0.643615  \n",
       "3906  1.529609 -0.065268  \n",
       "3907 -0.360576  0.643615  \n",
       "3908 -0.252799  0.643615  \n",
       "3909 -1.082511 -1.483035  \n",
       "\n",
       "[3910 rows x 26 columns]"
      ]
     },
     "execution_count": 61,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.DataFrame(X_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Apply the already fitted transformation to the test data (transform only, no refitting)\n",
    "X_test=preprocessor.transform(X_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.        ,  0.        ,  0.        , ..., -1.2771941 ,\n",
       "        -0.73751038, -0.77415132],\n",
       "       [ 1.        ,  0.        ,  0.        , ..., -1.2771941 ,\n",
       "        -0.6704111 , -0.06526803],\n",
       "       [ 1.        ,  0.        ,  0.        , ...,  0.78296635,\n",
       "        -0.4208322 , -0.77415132],\n",
       "       ...,\n",
       "       [ 0.        ,  1.        ,  0.        , ...,  0.78296635,\n",
       "         0.69001249,  0.64361526],\n",
       "       [ 1.        ,  0.        ,  0.        , ...,  0.78296635,\n",
       "        -0.22827818, -0.77415132],\n",
       "       [ 1.        ,  1.        ,  0.        , ...,  0.78296635,\n",
       "        -0.44611323,  2.06138184]])"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X_test"
   ]
  },
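  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The transformed matrices above carry only positional column labels (0 to 25). A fitted `ColumnTransformer` can report readable names through `get_feature_names_out()`, available in scikit-learn 1.0 and later. A minimal sketch on a hypothetical toy frame (the `toy_` names and values are assumptions for illustration, not the notebook's data):\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "from sklearn.compose import ColumnTransformer\n",
    "from sklearn.preprocessing import OneHotEncoder, StandardScaler\n",
    "\n",
    "# Hypothetical toy data; the column names mirror two of the notebook's features.\n",
    "toy = pd.DataFrame({\n",
    "    'Gender': ['Male', 'Female', 'Female', 'Male'],\n",
    "    'Age': [25.0, 40.0, 31.0, 52.0],\n",
    "})\n",
    "\n",
    "toy_preprocessor = ColumnTransformer([\n",
    "    ('OneHotEncoder', OneHotEncoder(drop='first'), ['Gender']),\n",
    "    ('StandardScaler', StandardScaler(), ['Age']),\n",
    "])\n",
    "\n",
    "toy_array = toy_preprocessor.fit_transform(toy)\n",
    "feature_names = list(toy_preprocessor.get_feature_names_out())\n",
    "\n",
    "# Wrap the transformed array in a DataFrame with readable column labels.\n",
    "toy_transformed = pd.DataFrame(toy_array, columns=feature_names)\n",
    "print(feature_names)\n",
    "```\n",
    "\n",
    "On the notebook's fitted `preprocessor`, the same call would presumably label the 26 transformed columns, assuming the installed scikit-learn version provides the method."
   ]
  },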
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Random Forest Classifier Training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3995    0\n",
       "2610    0\n",
       "3083    0\n",
       "3973    0\n",
       "4044    0\n",
       "       ..\n",
       "4426    0\n",
       "466     0\n",
       "3092    0\n",
       "3772    0\n",
       "860     1\n",
       "Name: ProdTaken, Length: 3910, dtype: int64"
      ]
     },
     "execution_count": 66,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y_train"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.ensemble import RandomForestClassifier\n",
    "from sklearn.ensemble import GradientBoostingClassifier\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.metrics import (\n",
    "    accuracy_score, classification_report, ConfusionMatrixDisplay,\n",
    "    precision_score, recall_score, f1_score, roc_auc_score, roc_curve,\n",
    ")"
   ]
  },
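  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Among the metrics imported above, `roc_auc_score` is normally given continuous scores, such as the positive-class probabilities from `predict_proba`, rather than the 0/1 output of `predict`. With hard labels the ROC curve collapses to a single operating point, and for a binary target the value equals balanced accuracy. A minimal sketch on synthetic data (not the holiday-package data; `make_classification`, its settings, and the `_demo` names are assumptions for illustration):\n",
    "\n",
    "```python\n",
    "from sklearn.datasets import make_classification\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.metrics import balanced_accuracy_score, roc_auc_score\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "# Synthetic, imbalanced binary data (roughly 80/20), purely illustrative.\n",
    "X_demo, y_demo = make_classification(\n",
    "    n_samples=1000, n_features=10, weights=[0.8, 0.2], random_state=0\n",
    ")\n",
    "X_tr, X_te, y_tr, y_te = train_test_split(\n",
    "    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=0\n",
    ")\n",
    "\n",
    "clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)\n",
    "\n",
    "# AUC from hard 0/1 labels: equals balanced accuracy for a binary target.\n",
    "auc_from_labels = roc_auc_score(y_te, clf.predict(X_te))\n",
    "# AUC from positive-class probabilities: uses the full ranking of samples.\n",
    "auc_from_proba = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])\n",
    "\n",
    "print(f'AUC from labels:        {auc_from_labels:.4f}')\n",
    "print(f'AUC from probabilities: {auc_from_proba:.4f}')\n",
    "print(f'Balanced accuracy:      {balanced_accuracy_score(y_te, clf.predict(X_te)):.4f}')\n",
    "```\n",
    "\n",
    "If ranking quality is what matters, passing `predict_proba(...)[:, 1]` is the more usual choice; estimators without `predict_proba` may offer `decision_function` instead."
   ]
  },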
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Logistic Regression\n",
      "Model performance for Training set\n",
      "- Accuracy: 0.8458\n",
      "- F1 score: 0.8200\n",
      "- Precision: 0.6994\n",
      "- Recall: 0.3032\n",
      "- Roc Auc Score: 0.6366\n",
      "----------------------------------\n",
      "Model performance for Test set\n",
      "- Accuracy: 0.8354\n",
      "- F1 score: 0.8078\n",
      "- Precision: 0.6829\n",
      "- Recall: 0.2932\n",
      "- Roc Auc Score: 0.6301\n",
      "===================================\n",
      "\n",
      "\n",
      "Decision Tree\n",
      "Model performance for Training set\n",
      "- Accuracy: 1.0000\n",
      "- F1 score: 1.0000\n",
      "- Precision: 1.0000\n",
      "- Recall: 1.0000\n",
      "- Roc Auc Score: 1.0000\n",
      "----------------------------------\n",
      "Model performance for Test set\n",
      "- Accuracy: 0.9254\n",
      "- F1 score: 0.9247\n",
      "- Precision: 0.8242\n",
      "- Recall: 0.7853\n",
      "- Roc Auc Score: 0.8723\n",
      "===================================\n",
      "\n",
      "\n",
      "Random Forest\n",
      "Model performance for Training set\n",
      "- Accuracy: 1.0000\n",
      "- F1 score: 1.0000\n",
      "- Precision: 1.0000\n",
      "- Recall: 1.0000\n",
      "- Roc Auc Score: 1.0000\n",
      "----------------------------------\n",
      "Model performance for Test set\n",
      "- Accuracy: 0.9305\n",
      "- F1 score: 0.9253\n",
      "- Precision: 0.9695\n",
      "- Recall: 0.6649\n",
      "- Roc Auc Score: 0.8299\n",
      "===================================\n",
      "\n",
      "\n",
      "Gradient Boost\n",
      "Model performance for Training set\n",
      "- Accuracy: 0.8939\n",
      "- F1 score: 0.8819\n",
      "- Precision: 0.8756\n",
      "- Recall: 0.5021\n",
      "- Roc Auc Score: 0.7429\n",
      "----------------------------------\n",
      "Model performance for Test set\n",
      "- Accuracy: 0.8589\n",
      "- F1 score: 0.8398\n",
      "- Precision: 0.7732\n",
      "- Recall: 0.3927\n",
      "- Roc Auc Score: 0.6824\n",
      "===================================\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "models = {\n",
    "    \"Logistic Regression\": LogisticRegression(),\n",
    "    \"Decision Tree\": DecisionTreeClassifier(),\n",
    "    \"Random Forest\": RandomForestClassifier(),\n",
    "    \"Gradient Boost\": GradientBoostingClassifier()\n",
    "}\n",
    "for i in range(len(list(models))):\n",
    "    model = list(models.values())[i]\n",
    "    model.fit(X_train, y_train) # Train model\n",
    "\n",
    "    # Make predictions\n",
    "    y_train_pred = model.predict(X_train)\n",
    "    y_test_pred = model.predict(X_test)\n",
    "\n",
    "    # Training set performance\n",
    "    model_train_accuracy = accuracy_score(y_train, y_train_pred) # Calculate Accuracy\n",
    "    model_train_f1 = f1_score(y_train, y_train_pred, average='weighted') # Calculate weighted F1-score (averaged over both classes)\n",
    "    model_train_precision = precision_score(y_train, y_train_pred) # Calculate Precision (positive class only)\n",
    "    model_train_recall = recall_score(y_train, y_train_pred) # Calculate Recall (positive class only)\n",
    "    model_train_rocauc_score = roc_auc_score(y_train, y_train_pred) # Calculate ROC AUC from hard 0/1 predictions, not probabilities\n",
    "\n",
    "\n",
    "    # Test set performance\n",
    "    model_test_accuracy = accuracy_score(y_test, y_test_pred) # Calculate Accuracy\n",
    "    model_test_f1 = f1_score(y_test, y_test_pred, average='weighted') # Calculate weighted F1-score (averaged over both classes)\n",
    "    model_test_precision = precision_score(y_test, y_test_pred) # Calculate Precision (positive class only)\n",
    "    model_test_recall = recall_score(y_test, y_test_pred) # Calculate Recall (positive class only)\n",
    "    model_test_rocauc_score = roc_auc_score(y_test, y_test_pred) # Calculate ROC AUC from hard 0/1 predictions, not probabilities\n",
    "\n",
    "\n",
    "    print(list(models.keys())[i])\n",
    "    \n",
    "    print('Model performance for Training set')\n",
    "    print(\"- Accuracy: {:.4f}\".format(model_train_accuracy))\n",
    "    print('- F1 score: {:.4f}'.format(model_train_f1))\n",
    "    \n",
    "    print('- Precision: {:.4f}'.format(model_train_precision))\n",
    "    print('- Recall: {:.4f}'.format(model_train_recall))\n",
    "    print('- Roc Auc Score: {:.4f}'.format(model_train_rocauc_score))\n",
    "\n",
    "    \n",
    "    \n",
    "    print('----------------------------------')\n",
    "    \n",
    "    print('Model performance for Test set')\n",
    "    print('- Accuracy: {:.4f}'.format(model_test_accuracy))\n",
    "    print('- F1 score: {:.4f}'.format(model_test_f1))\n",
    "    print('- Precision: {:.4f}'.format(model_test_precision))\n",
    "    print('- Recall: {:.4f}'.format(model_test_recall))\n",
    "    print('- Roc Auc Score: {:.4f}'.format(model_test_rocauc_score))\n",
    "\n",
    "    \n",
    "    print('='*35)\n",
    "    print('\\n')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Hyperparameter Training\n",
    "rf_params = {\"max_depth\": [5, 8, 15, None, 10],\n",
    "             \"max_features\": [5, 7, \"auto\", 8],\n",
    "             \"min_samples_split\": [2, 8, 15, 20],\n",
    "             \"n_estimators\": [100, 200, 500, 1000]}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'max_depth': [5, 8, 15, None, 10],\n",
       " 'max_features': [5, 7, 'auto', 8],\n",
       " 'min_samples_split': [2, 8, 15, 20],\n",
       " 'n_estimators': [100, 200, 500, 1000]}"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rf_params"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Models list for Hyperparameter tuning\n",
    "randomcv_models = [\n",
    "                   (\"RF\", RandomForestClassifier(), rf_params)\n",
    "                   \n",
    "                   ]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('RF',\n",
       "  RandomForestClassifier(),\n",
       "  {'max_depth': [5, 8, 15, None, 10],\n",
       "   'max_features': [5, 7, 'auto', 8],\n",
       "   'min_samples_split': [2, 8, 15, 20],\n",
       "   'n_estimators': [100, 200, 500, 1000]})]"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "randomcv_models"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Fitting 3 folds for each of 100 candidates, totalling 300 fits\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[Parallel(n_jobs=-1)]: Using backend LokyBackend with 32 concurrent workers.\n",
      "[Parallel(n_jobs=-1)]: Done  98 tasks      | elapsed:    9.5s\n",
      "[Parallel(n_jobs=-1)]: Done 300 out of 300 | elapsed:   20.4s finished\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---------------- Best Params for RF -------------------\n",
      "{'n_estimators': 1000, 'min_samples_split': 2, 'max_features': 7, 'max_depth': None}\n"
     ]
    }
   ],
   "source": [
    "from sklearn.model_selection import RandomizedSearchCV\n",
    "\n",
    "model_param = {}\n",
    "for name, model, params in randomcv_models:\n",
    "    random = RandomizedSearchCV(estimator=model,\n",
    "                                   param_distributions=params,\n",
    "                                   n_iter=100,\n",
    "                                   cv=3,\n",
    "                                   verbose=2,\n",
    "                                   n_jobs=-1)\n",
    "    random.fit(X_train, y_train)\n",
    "    model_param[name] = random.best_params_\n",
    "\n",
    "for model_name in model_param:\n",
    "    print(f\"---------------- Best Params for {model_name} -------------------\")\n",
    "    print(model_param[model_name])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Random Forest\n",
      "Model performance for Training set\n",
      "- Accuracy: 1.0000\n",
      "- F1 score: 1.0000\n",
      "- Precision: 1.0000\n",
      "- Recall: 1.0000\n",
      "- Roc Auc Score: 1.0000\n",
      "----------------------------------\n",
      "Model performance for Test set\n",
      "- Accuracy: 0.9315\n",
      "- F1 score: 0.9265\n",
      "- Precision: 0.9697\n",
      "- Recall: 0.6702\n",
      "- Roc Auc Score: 0.8325\n",
      "===================================\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "models={\n",
    "    \n",
    "    \"Random Forest\":RandomForestClassifier(n_estimators=1000,min_samples_split=2,\n",
    "                                          max_features=7,max_depth=None)\n",
    "}\n",
    "for i in range(len(list(models))):\n",
    "    model = list(models.values())[i]\n",
    "    model.fit(X_train, y_train) # Train model\n",
    "\n",
    "    # Make predictions\n",
    "    y_train_pred = model.predict(X_train)\n",
    "    y_test_pred = model.predict(X_test)\n",
    "\n",
    "    # Training set performance\n",
    "    model_train_accuracy = accuracy_score(y_train, y_train_pred) # Calculate Accuracy\n",
    "    model_train_f1 = f1_score(y_train, y_train_pred, average='weighted') # Calculate F1-score\n",
    "    model_train_precision = precision_score(y_train, y_train_pred) # Calculate Precision\n",
    "    model_train_recall = recall_score(y_train, y_train_pred) # Calculate Recall\n",
    "    model_train_rocauc_score = roc_auc_score(y_train, y_train_pred)\n",
    "\n",
    "\n",
    "    # Test set performance\n",
    "    model_test_accuracy = accuracy_score(y_test, y_test_pred) # Calculate Accuracy\n",
    "    model_test_f1 = f1_score(y_test, y_test_pred, average='weighted') # Calculate F1-score\n",
    "    model_test_precision = precision_score(y_test, y_test_pred) # Calculate Precision\n",
    "    model_test_recall = recall_score(y_test, y_test_pred) # Calculate Recall\n",
    "    model_test_rocauc_score = roc_auc_score(y_test, y_test_pred) #Calculate Roc\n",
    "\n",
    "\n",
    "    print(list(models.keys())[i])\n",
    "    \n",
    "    print('Model performance for Training set')\n",
    "    print(\"- Accuracy: {:.4f}\".format(model_train_accuracy))\n",
    "    print('- F1 score: {:.4f}'.format(model_train_f1))\n",
    "    \n",
    "    print('- Precision: {:.4f}'.format(model_train_precision))\n",
    "    print('- Recall: {:.4f}'.format(model_train_recall))\n",
    "    print('- Roc Auc Score: {:.4f}'.format(model_train_rocauc_score))\n",
    "\n",
    "    \n",
    "    \n",
    "    print('----------------------------------')\n",
    "    \n",
    "    print('Model performance for Test set')\n",
    "    print('- Accuracy: {:.4f}'.format(model_test_accuracy))\n",
    "    print('- F1 score: {:.4f}'.format(model_test_f1))\n",
    "    print('- Precision: {:.4f}'.format(model_test_precision))\n",
    "    print('- Recall: {:.4f}'.format(model_test_recall))\n",
    "    print('- Roc Auc Score: {:.4f}'.format(model_test_rocauc_score))\n",
    "\n",
    "    \n",
    "    print('='*35)\n",
    "    print('\\n')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYoAAAEWCAYAAAB42tAoAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAABEIElEQVR4nO3dd3gUVffA8e+hh46AdAUk9KqAIoogAqKU1wIqqIAVpYiKiAXFjooFBUV+KOirYJdieVWaSlNBekekRFB6Cy3l/P64k7CEZDNAdjflfJ5nn8xOPTO7mbMzd+69oqoYY4wxackV6QCMMcZkbpYojDHGBGWJwhhjTFCWKIwxxgRlicIYY0xQliiMMcYEZYnCnBIRWSEiLSMdR2YhIo+KyNgIbXu8iDwbiW1nNBHpLiI/nOay9p0MMUsUWZiIbBSRwyJyUET+8U4chUO5TVWto6qzQrmNJCKSX0ReEJHN3n6uE5GHRETCsf1U4mkpIjGB41T1eVW9I0TbExHpLyLLRSRWRGJE5DMRqReK7Z0uERkqIh+eyTpU9SNVbetjWyclx3B+J3MqSxRZX0dVLQw0BBoBj0Q2nFMnInnSmPQZ0Bq4CigC3ALcBYwIQQwiIpnt/2EEcB/QHzgLqA5MAq7O6A0F+QxCLpLbNj6pqr2y6AvYCFwR8P4l4JuA9xcBc4G9wBKgZcC0s4BxwFZgDzApYFoHYLG33FygfsptAuWBw8BZAdMaATuBvN7724BV3vq/B84NmFeBPsA64K9U9q01cASolGL8hUACUM17Pwt4AfgN2AdMThFTsGMwC3gOmOPtSzWglxfzAWADcLc3byFvnkTgoPcqDwwFPvTmqeztVw9gs3csHgvYXhTwvnc8VgGDgJg0Pttobz+bBvn8xwOjgG+8eH8FzguYPgLYAuwHFgKXBkwbCnwOfOhNvwNoCszzjtU2YCSQL2CZOsCPwG7gX+BR4ErgGBDnHZMl3rzFgHe99fwNPAvk9qb19I75a966nvXGzfamizdtu/eZLgXq4n4kxHnbOwhMTfl/AOT24vrTOyYLSfEdstdpnGsiHYC9zuDDO/EfpCKwDBjhva8A7ML9Gs8FtPHel/amfwN8ApQA8gKXeePP9/5BL/T+6Xp428mfyjZnAHcGxPMyMNob/g+wHqgF5AEeB+YGzKveSecsICqVfRsG/JTGfm/i+Al8lnciqos7mX/B8RN3esdgFu6EXseLMS/u1/p53snqMuAQcL43f0tSnNhJPVH8Hy4pNACOArUC98k75hVxJ8C0EkVvYFM6n/943Im2qRf/R8DHAdNvBkp60x4E/gEKBMQd531Oubx4L8Al1jzevqwCBnjzF8Gd9B8ECnjvL0x5DAK2PQl4x/tMzsYl8qTPrCcQD/TzthXFiYmiHe4EX9z7HGoB5QL2+dkg/wcP4f4PanjLNgBKRvp/Nau/Ih6Avc7gw3P/IAdxv5wUmA4U96Y9DPw3xfzf40785XC/jEukss63gWdSjFvD8UQS+E95BzDDGxbcr9cW3vvvgNsD1pELd9I913uvwOVB9m1s4EkvxbT5eL/UcSf7YQHTauN+ceYOdgwCln06nWM8CbjPG26Jv0RRMWD6b8CN3vAGoF3AtDtSri9g2mPA/HRiGw+MDXh/FbA6yPx7gAYBcf+czvoHAF95wzcBi9KYL/kYeO/L4BJkVMC4m4CZ3nBPYHOKdfTkeKK4HFiLS1q5UtnnYIliDdD5TP+37HXiK7PdkzWn7j+qWgR3EqsJlPLGnwt0EZG9SS/gElySqATsVtU9qazvXODBFMtVwt1mSelzoJmIlAda4E6SvwSsZ0TAOnbjkkmFgOW3BNmvnV6sqSnnTU9tPZtwVwalCH4MUo1BRNqLyHwR2e3NfxXHj6lf/wQMHwKSHjAon2J7wfZ/F2nvv59tISIPisgqEdnn7UsxTtyXlPteXUS+9h6M2A88HzB/JdztHD/OxX0G2wKO
+zu4K4tUtx1IVWfgbnuNAv4VkTEiUtTntk8lTuOTJYpsQlV/wv3aGu6N2oL7NV084FVIVYd5084SkeKprGoL8FyK5Qqq6sRUtrkX+AHoCnQDJqr3s85bz90p1hOlqnMDVxFkl6YBF4pIpcCRItIUdzKYETA6cJ5zcLdUdqZzDE6KQUTy425dDQfKqGpx4FtcgksvXj+24W45pRZ3StOBiiLS+HQ2JCKX4q6ouuKuHIvj7vcHPjGWcn/eBlYD0apaFHevP2n+LbhbcqlJuZ4tuCuKUgHHvaiq1gmyzIkrVH1DVS/A3RasjrullO5y6cRpTpMliuzldaCNiDTEFVJ2FJF2IpJbRAp4j3dWVNVtuFtDb4lICRHJKyItvHX8H9BbRC70ngQqJCJXi0iRNLY5AbgVuM4bTjIaeERE6gCISDER6eJ3R1R1Gu5k+YWI1PH24SLcffi3VXVdwOw3i0htESkIPA18rqoJwY5BGpvNB+QHdgDxItIeCHxk81+gpIgU87sfKXyKOyYlRKQC0DetGb39ewuY6MWcz4v/RhEZ7GNbRXDlADuAPCLyBJDer/IiuILtgyJSE7gnYNrXQFkRGeA9tlxERC70pv0LVE56asz7fv0AvCIiRUUkl4icJyKX+YgbEWniff/yArG4hxoSArZVNcjiY4FnRCTa+/7WF5GSfrZr0maJIhtR1R3AB8AQVd0CdMb9KtyB+6X1EMc/81twv7xX4wqvB3jrWADcibv034MrkO4ZZLNTcE/o/KuqSwJi+Qp4EfjYu42xHGh/irt0HTAT+B+uLOZD3JM0/VLM91/c1dQ/uILW/l4M6R2DE6jqAW/ZT3H73s3bv6Tpq4GJwAbvlkpqt+OCeRqIAf7CXTF9jvvlnZb+HL8Fsxd3S+UaYKqPbX2P+zGwFnc77gjBb3UBDMTt8wHcD4ZPkiZ4x6YN0BF3nNcBrbzJn3l/d4nIH97wrbjEuxJ3LD/H3600cAnt/7zlNuFuwyVdKb8L1PaO/6RUln0V9/n9gEt67+IKy80ZkON3CozJekRkFq4gNSK1o8+EiNyDK+j29UvbmEixKwpjwkREyolIc+9WTA3co6ZfRTouY9JjNSKNCZ98uKd/quBuJX2MK4cwJlOzW0/GGGOCsltPxhhjgspyt55KlSqllStXjnQYxhiTpSxcuHCnqpY+nWWzXKKoXLkyCxYsiHQYxhiTpYjIptNd1m49GWOMCcoShTHGmKAsURhjjAnKEoUxxpigLFEYY4wJyhKFMcaYoEKWKETkPRHZLiLL05guIvKGiKwXkaUicn6oYjHGGHP6QlmPYjyuieQP0pjeHtc8dTSuf+a3vb/GGJ+OxCWkP5MxZyhkiUJVfxaRykFm6Qx84PWINl9EiotIOa/TE2NMGnYePMrUJVv5atHfLI3ZF+lwTCZ34eZl3L5g8hmtI5I1sytwYkcqMd64kxKFiNwF3AVwzjnnhCW4nGrjzli6vDOP2KPxkQ7FpOFIXAKJCrXLFaV/62ii8uaOdEgmEyqwdxfNRr9Ize+/ZH/ZtDp19CeSiUJSGZdqU7aqOgYYA9C4cWNr7jaEPvp1E3tij9Hz4spIap+QibhC+fNwZd2y1CybXs+mJke7bhBMnwKPPELRxx+HQoVOe1WRTBQxnNi5fEVga4RiybH+2LyH579ZRXyiy79r/z1Am9pleLxD7QhHZow5ZStWQPHiUKECvPgiPP001KlzxquNZKKYAvQVkY9xhdj7rHwiYwz7bjXr/j3ga96/dsWyYUcsl1QrRa5cwkVVS9KnVbUQR2iMyVCxsfDMM/DKK9C9O4wfD9Uy7v84ZIlCRCYCLYFSIhIDPAnkBVDV0cC3wFXAeuAQ0CtUsWRVCYnKy9+vYXfsUd/LHItPZNLirVQuWZDCBdL/eAvmy83V9csx8qZGiN1rMibr+eYb6NMHNm2C225zVxIZLJRPPd2UznQF+oRq+1nBvsNx/HfeRg4eTf0Rx/1H4pjw62aKReWl
YD7/BZZ1yhdlwh0XUaxg3owK1RiTGb31lksStWvDzz/DpZeGZDNZrj+KrGDvoWNMWvR38n3/1KjCJwu2sH77QfLlSbveY5ECeRjXqwnnn1MiFKEaY7Ka+HjYsQPKlYOuXeHwYejXD/LlC9kmLVFkoN2xx/hp7XbmrN/F5wtj0p3/rEL5mHjnRTQ7r2QYojPGZHm//QZ33w158sD8+VCqFDz4YMg3a4niNK3+Zz//7j+x7OCzBVv4eqkrjy+YLzc/D2pF/iBXCwXy5iZvbmtuyxiTjr174dFHYfRodyUxYgTkCt+5wxKFD6rK+u0HORqfCMCyv/fxyJfLUp23bNECfHL3RRSPymdlBMaYM7dsGbRp42439e/vHnktGt46NJYofJi6dBv9Jy46YVzDSsUZ0qEWKesNVigeRdliBcIYnTEmW4qLg7x5oXp1aNUKHnoIzo9M26mWKFKRmKjJBdHb9h1OThIvXlePswrlJ5fARVVLUii/HT5jTAY7etQ94vrhh/DHH1C4MEycGNGQ7EyXQmKi0uLlmcTsOXzC+BubVKJr40pW18AYEzozZsA998DatXDDDS5pFC4c6agsUaQ0c812YvYc5pJqpZKfRiqQNzfdmp5jScIYExqHD8Ndd7mriKpV4X//g3btIh1VMksUwNApK5i1ZjvgKsEBDOlQmxpli0QyLGNMTlGgAOzcCY8/7p5uioqKdEQnyNGJ4lh8In0m/MFPa3dQunB+Gld2ldoqlShI9TKRv9wzxmRjS5e6Aup334WKFV1THGF85PVU5MhEEZ+QyKiZf/L7xt3MXr+T6mUKM7BtDdrWKRvp0Iwx2V1sLAwdCq+9BiVKwLp1LlFk0iQBOTBRqCq9P1zItFXbiT67MA0qFefl6+tTvYzdZjLGhNiUKa65jc2b4c47YdgwOOusSEeVrhyXKP7df5Rpq7Zz92VVeaR9rUiHY4zJSSZNcpXlZs+G5s0jHY1vOS5RqNeJXpWSp9/bkzHG+BIXB2+84SrMnX++a3qjQAFXkS4LyTGJYu6fO9lx4Ch7Yo9FOhRjTE4wf75rwG/pUnj4YZcoimTNW9zZNlFs2X2ITbsOAXA4LoE7P1hwwvQShULXJK8xJgfbswceeQTGjHFdkn71FXTuHOmozki2TRTdxs5ny+4Ta1c/dlUtWtc6m7y5c1HprIIRiswYk62NGQNjx8L997unm7LoVUSgbJko9h2O48CReNrWLsOdLaoCkDd3LupVKEbuXFa72hiTwdasca27XnIJDBgA7dtD/fqRjirDZKtEkZCorP5nP51GziEhUalQIoomlTP/o2fGmCzqyBF44QX3mGvNmrB4MeTPn62SBGSjRDF/wy5uHvsr8YlK/jy5eKpTHdrULhPpsIwx2dWPP8K998L69dCtG7zyCmTT9uCCJgoRKQB0AC4FygOHgeXAN6q6IvTh+ffRr5spUiAPtzWvQp0KRbm8piUJY0yI/PwztG0L0dEuYVxxRaQjCqk0E4WIDAU6ArOAX4HtQAGgOjDMSyIPqurS0IeZvp0HjhJ9dhH6tY6OdCjGmOwoIQFWroR69eDSS10bTd26uXoR2VywK4rfVXVoGtNeFZGzgXMyPiRjjMlkFi2C3r1h1SrXNlOZMnDbbZGOKmzSbIVKVb8JfC8ihVJM366qJ1ZOiJD5G3Yxb8Ou5FrXxhiTIQ4cgAcegMaNYeNGePttOPvsSEcVduk2VygiF4vISmCV976BiLwV8shOwZQlWwFoXcvKJYwxGWTfPqhTx7XyeuedsHo1dO+ebQusg/Hz1NNrQDtgCoCqLhGRFiGN6hRNW/kvJQrmpfdl50U6FGNMVrd/v2u4r1gx1+tc69bQrFmko4ooXw2gq+qWFKMSQhDLadl18CjbDxxl/5H4SIdijMnK4uLgpZdc3xB//OHGPf54jk8S4O+KYouIXAyoiOQD+uPdhsoMhv+wFoCnO9eJcCTGmCxrzhxXWL18OfznP1C6dKQjylT8XFH0
BvoAFYAYoCFwbwhj8m1P7DEm/rYZgOsvqBjhaIwxWVK/fq7pjX37YPJk14hfpUqRjipT8XNFUUNVuweOEJHmwJzQhOTfo18tA6BD/XLkz5M7wtEYY7IM1eOF0mXLwsCB8OSTULhwZOPKpPxcUbzpc1zYHTwaT+5cwovXZa92VYwxIbR6tetIaPJk9/6xx+Dlly1JBBGsZnYz4GKgtIg8EDCpKJBpfr7Xq1CMQvmzTZNVxphQOXwYnn8eXnwRChVy740vwa4o8gGFccmkSMBrP3C9n5WLyJUiskZE1ovI4FSmFxORqSKyRERWiEgvv4HvOxTHL+t2kpBoleyMMemYPt01vfHss3Djja5Z8BtvjHRUWUaaP8VV9SfgJxEZr6qbTnXFIpIbGAW0wRWC/y4iU1R1ZcBsfYCVqtpRREoDa0TkI1VNt7/SXbFHAWhYqfiphmaMyWliYiBPHpcwLr880tFkOX7u2RwSkZeBOrhGAQFQ1fSOdlNgvapuABCRj4HOQGCiUKCIiAju6mU3cEoVIhpXLnEqsxtjcoKEBBg9GvLlc7Wqb73VXUHkzx/pyLIkP4XZHwGrgSrAU8BG4Hcfy1UAAivqxXjjAo0EagFbgWXAfaqamHJFInKXiCwQkQU7duzwsWljTI71xx9w0UXQty98/70bJ2JJ4gz4SRQlVfVdIE5Vf1LV24CLfCyXWoMoKQsU2gGLcX1dNARGikjRkxZSHaOqjVW1cWmvIszCTXt8hGCMyTH274f77oMmTWDLFpg4ET77LNJRZQt+EkWc93ebiFwtIo0AP7XbYoDAWisVcVcOgXoBX6qzHvgLqOlj3Qz7bjUAtcudlFeMMTnRkiUwcqSrYb16tbvVlAMb8AsFP2UUz4pIMeBBXP2JosAAH8v9DkSLSBXgb+BGoFuKeTYDrYFfRKQMUAPY4CfwXbHHyJNLiC5TxM/sxpjs6K+/YOZM1zfEpZe6bkmrVIl0VNlOuolCVb/2BvcBrSC5ZnZ6y8WLSF/ge1y9i/dUdYWI9PamjwaeAcaLyDLcraqHVXWn3+DvbVXN76zGmOzk2DHXR/XTT7se5q65BkqUsCQRIsEq3OUGuuIKoP+nqstFpAPwKBAFNEpv5ar6LfBtinGjA4a3Am1PL3RjTI70yy/u9tLKlXDttTBihEsSJmSCXVG8iytj+A14Q0Q2Ac2Awao6KQyxGWPMiXbsgLZtXVekU6dChw6RjihHCJYoGgP1VTVRRAoAO4FqqvpPeEIzxhhcA37TpkGbNq7576+/do+/FiqU/rImQwR76ulYUp0GVT0CrLUkYYwJqxUr4LLL3FXErFluXOvWliTCLNgVRU0RWeoNC3Ce914AVVVrstUYExqHDrl2mV5+2XVLOnYstMhUPTDnKMESRa2wRWGMMUlUXTPgv/0GPXq4ZGE9zkVUsEYBT7khQGOMOW3btsHZZ0Pu3PDoo1CsGLRsGemoDP5qZhtjTOgkJMAbb0CNGvDWW25c586WJDIRSxTGmMhZsACaNnVtNF18MVx1VaQjMqnwlShEJEpEaoQ6GGNMDvLSSy5JbNsGn3wC330H550X6ahMKtJNFCLSEdfC6/+89w1FZEqI4zLGZEeqEOe1M9q0KfTpA6tWQdeu1oBfJubnimIorhOivQCquhioHKqAjDHZ1J9/wpVXwmCvV+SWLeHNN12htcnU/CSKeFXdF/JIjDHZ09Gjrk5E3bowb57dXsqC/DQzvlxEugG5RSQa6A/MDW1YwR2JS4jk5o0xfi1cCDff7PqH6NIFXn8dypePdFTmFPm5ouiH6y/7KDAB19z4gBDGlK7Ji/8GIH8ee2jLmEytcGFX9vDtt/Dpp5Yksig/VxQ1VPUx4LFQB+PXkTjXrfYNTSqlM6cxJqwSE2HcOHeLaexYVzdi+XLIZT/qsjI/n96rIrJaRJ4RkTohj+gU5LKnJIzJPJYvd+0x3XEHrFsHsbFuvCWJLC/dT1BVWwEtgR3AGBFZJiKPhzow
Y0wWERsLDz8MjRq5sohx41xLr9bCa7bhK9Wr6j+q+gbQG1en4olQBmWMyUKOHHHJ4dZbYc0a6NnT6kRkM34q3NUSkaEishwYiXviqWLIIzPGZF4xMTBokGunqWRJdyXx7rtu2GQ7fq4oxgF7gLaqepmqvq2q20McV1AfzreGbY2JiPh4eO01qFULRo6ExYvd+LPOimhYJrTSfepJVS8KRyCnIi7BPfVUomDeCEdiTA7y669w992wZIlrvG/kSKhSJdJRmTBIM1GIyKeq2lVElgEaOIkI93CXS4SODcojdh/UmPBITIRevWDfPvj8c7j2WiuHyEGCXVHc5/3tEI5AjDGZjKpLCldeCUWKwJdfQoUKbtjkKGmWUajqNm/wXlXdFPgC7g1PeMaYiFi3Dtq1c626jhnjxtWsaUkih/JTmN0mlXHtMzoQY0wmcPQoPP001KvnyiRGjoQBAyIdlYmwYGUU9+CuHKqKyNKASUWAOaEOzBgTAX36uMdcb7wRXn0VypWLdEQmEwhWRjEB+A54ARgcMP6Aqu4OaVTGmPDZvt0VVpct62pYd+nibjsZ4wl260lVdSPQBzgQ8EJE7KFpY7K6xERX/lCjhuuzGiA62pKEOUl6VxQdgIW4x2MDn4VToGoI4zLGhNLSpdC7t2vltWVLeOqpSEdkMrE0E4WqdvD+Wo0aY7KTzz93ZRAlSsAHH7iOhaxOhAnCT1tPzUWkkDd8s4i8KiLnhD601B2NT2TDzlgSEhMjFYIxWdP+/e5vy5au0HrNGrjlFksSJl1+Ho99GzgkIg2AQcAm4L8hjSqIf/cfAaBZVWt8zBhfNm+Gzp2hdWvXiF+pUjBihLXPZHzzkyjiVVWBzsAIVR2Be0Q2XSJypYisEZH1IjI4jXlaishiEVkhIj+lt85Eda2JdGlsvdsZE1RcHAwf7hrwmzbNVZ5TTX85Y1Lw0xXqARF5BLgFuFREcgPptsbnzTcKV2EvBvhdRKao6sqAeYoDbwFXqupmETnbT9B1KxSlQN7cfmY1JmfatAk6dXKF1h07wptvwrnnRjoqk0X5uaK4ATgK3Kaq/wAVgJd9LNcUWK+qG1T1GPAx7qokUDfgS1XdDBDp5suNyfKSrhjKloUyZeCrr2DyZEsS5oz46Qr1H+AjoJiIdACOqOoHPtZdAdgS8D7GGxeoOlBCRGaJyEIRudVn3MaYQKrw4YfQpAkcPAj588MPP8B//mOF1eaM+XnqqSvwG9AF6Ar8KiLX+1h3at/OlDdI8wAXAFcD7YAhIlI9lRjuEpEFIrLg2LFjPjZtTA6yZo0rqL7lFsiTB3btinREJpvxU0bxGNAk6baQiJQGpgGfp7NcDBBY4lwR2JrKPDtVNRaIFZGfgQbA2sCZVHUMMAagVJVaVhpnDLje5p55BoYNg6goePttuOsuyOXnjrIx/vn5RuVKUXawy+dyvwPRIlJFRPIBNwJTUswzGVdAnkdECgIXAqt8rNsYkzs3/PILXH+9u6ro3duShAkJP1cU/xOR74GJ3vsbgG/TW0hV40WkL/A9kBt4T1VXiEhvb/poVV0lIv8DlgKJwFhVXX46O2JMjvDPP/Doo67JjUqV4NtvoUCBSEdlsjk/fWY/JCLXApfgyh3GqOpXflauqt+SIqmo6ugU71/G31NUxuRcCQmuAb9HHoHDh6F9e5coLEmYMAjWH0U0MBw4D1gGDFTVv8MVmDHGs2iRu63022+u0Pqtt6D6Sc98GBMywW5ovgd8DVyHa0H2zbBEZIw50ciRsHEjfPQR/PijJQkTdsFuPRVR1f/zhteIyB/hCMiYHE8VJk2CypWhUSPXDMfw4a61V2MiINgVRQERaSQi54vI+UBUivfGmIy2caNreuPaa+H11924EiUsSZiICnZFsQ14NeD9PwHvFbg8VEEZk+PExbk+qp96yj3iOnz48V7njImwYB0XtQpnIMbkaO+8A4MHuyY3RoyAcyLW5YsxJ/FTj8IYEwq7drlbTRdc
AHfeCdWqwZVXRjoqY05i1TiNCTdVeP99qFkTunRxTXHkz29JwmRaliiMCadVq6BVK+jZE6Kj3dNNeezC3mRu6X5DRUSA7kBVVX3a6y+7rKr+FvLojMlOlixxzYAXLuxqWd9+u7XNZLIEP9/St4BmwE3e+wO4nuuMMX7ExLi/9eu7p5pWr3ZlEpYkTBbh55t6oar2AY4AqOoeIF9IozImO9i6FW64wfVZ/fffrgOhRx6Bs331+GtMpuEnUcR5/V8rJPdHkRjSqIzJyhISXLMbtWq5bkgHDYJSpSIdlTGnzU8p2hvAV8DZIvIccD3weEijMiarOnIEWrSA33+HNm1cA37VqkU6KmPOiJ9mxj8SkYVAa1wz4/9RVetcyJhAcXGQN69r9rtVK3jgAXfbyfqrNtmAnz6zzwEOAVNxPdTFeuOMMarw+efuquEPr93MF1+EG2+0JGGyDT+3nr7BlU8IUACoAqwB6oQwrjQdOBJPopWQmMxgwwbo2xe++8618mpPMZlsys+tp3qB772WY+8OWUQ+VCgRFcnNG+Ma8HvsMVdZ7vXXoU8fqzhnsq1T/mar6h8i0iQUwfjVtnaZSG7eGDh4EK66yjXgV7FipKMxJqT81Mx+IOBtLuB8YEfIIjImM9q5Ex56CK65xvUX8fjjdqvJ5Bh+riiKBAzH48osvghNOMZkMomJMH68SxL790M9706sJQmTgwRNFF5Fu8Kq+lCY4jEm81i5Enr3hl9+gUsugdGjoU5EnuEwJqKCJgpVTbBuT02OtWABrFgB777rWnu1qwiTQ6WZKEQkj6rGA4tFZArwGRCbNF1VvwxDfMaE17ffug6FbrnFvTp0gLPOinRUxkRUsJ9ISc2InwXswvWR3dF7dQhxXMaEV0wMXH89XH21a6dJ1VWYsyRhTNBbTwKgqr3CFIsx4RcfD6NGuaeY4uPhuedg4ECrVW1MgGCJonSKR2NPoKqvhiAeY8Jr4UIYMMB1QzpqFFStGumIjMl0giWK3EBhvCsLY7KNfftg+nS49lq48EL49VfX85xdRRiTqmCJYpuqPh22SIwJNVX49FN3BbFrF2zcCOXLQ9OmkY7MmEwtWGG2/bwy2ceff0L79q5V1woVYO5clySMMekKdkXRKb2FRaSwqh7MwHiMyXgHDsAFF7ha1m+8AffeC7lzRzoqY7KMYFcU40XkFRFpISKFkkaKSFURuV1EvgeuDH2IxpympUvd3yJFXKW5VaugXz9LEsacojQThaq2BqbjmhRfISL7RGQX8CFQFuihqp+HJ0xjTsGOHdCjBzRo4CrQAVx3nbvlZIw5Zek14fEt8O3prlxErgRG4J6gGquqw9KYrwkwH7jBko85bYmJ8N57MGiQawb80UehZctIR2VMluenK9TPReQqETmlhm68BgVHAe2B2sBNIlI7jfleBL4/lfUbc5LrroM773QtvC5e7CrPFSwY6aiMyfL8nPxHA92BdSIyTERq+lx3U2C9qm5Q1WPAx0DnVObrh2u2fLvP9RpzXGysq1ENcNNNrknwWbOg9km/SYwxpyndRKGq01S1O67Doo3AjyIyV0R6iUjeIItWALYEvI/xxiUTkQrANbhklCYRuUtEFojIgvTiNTnI1KkuIbz1lnvftasrm7CKc8ZkKF+3k0SkJNATuANYhCt3OB/4MdhiqYzTFO9fBx5W1YRg21fVMaraWFUb+4nXZHNbtrha1Z06uSeaLrgg0hEZk6356Qr1S6Am8F+go6pu8yZ9ks4v/BigUsD7isDWFPM0Bj4W9wuwFHCViMSr6iR/4Zsc58MPXWdCiYkwbBjcfz/kyxfpqIzJ1vx0hTrWe/opmYjkV9Wj6fzC/x2IFpEqwN/AjUC3wBlUtUrAOscDX1uSMKlKava7YkX3JNObb0KVKukuZow5c35uPT2byrh56S3kdXrUF/c00yrgU1VdISK9RaT3qYVpcqy9e+Gee1yf1eCSxNdfW5IwJoyC9XBXFlf4HCUijThe
5lAU8PXMYWr1MFQ11YJrVe3pZ50mh1CFiRPhgQdcBbr77z9+VWGMCatgt57a4QqwKwKBfU8cAB4NYUwmp/vrL7jrLpg2zTX//d130KhRpKMyJsdKM1Go6vvA+yJynap+EcaYTE4XF+faaRo1Cu6+29pmMibCgt16ullVPwQqp9bTnfVwZzLU9OnwzTfw6qtQvTps2gQFCkQ6KmMMwQuzk1qMLQwUSeVlzJn791+4+Wa44gqYMsV1KASWJIzJRILdenrHG3xLVXeEKR6TUyQmwv/9Hwwe7JrhGDIEHnkEoqIiHZkxJgU/9SjmishfwCfAl6q6J8QxmZxg3z54/HFo2BDefhtq+m1CzBgTbn7aeooGHgfqAAtF5GsRuTnkkZns5+BBVwaRkAAlSsCvv8KMGZYkjMnkfLX1pKq/qeoDuBZhdwPvhzQqk/1Mnuwa8HvwQfjpJzeualWrF2FMFuCnP4qiItJDRL4D5gLbcAkjYgrn93PHzGQKmzZB587wn/9A8eIwZw5cfnmkozLGnAI/Z9wlwCTgaVVNt+mOcGhbp2ykQzB+qML118PKlfDSSzBgAOQN1jK9MSYz8pMoqqpqyubBIyaXCLlz2e2KTG3+fKhTxzUBPmYMnHUWnHtupKMyxpymNG89icjr3uAUETnpFZ7wTJaye7erSd2sGQwf7sY1amRJwpgsLtgVxX+9v8PDEYjJwlRdPxEPPuiSxYMPHm/t1RiT5QWrcLfQG2yoqiMCp4nIfcBPoQzMZCGPPuo6EbroIvjxR2jQINIRGWMykJ/HY3ukMq5nBsdhspojR2DnTjfcq5erNDdnjiUJY7KhYI0C3oTrka5KijKJIsCuUAdmMrEff4R774W6deGrr1wjftWrRzoqY0yIBCujSKozUQp4JWD8AWBpKIMymdQ//7iOhCZOhOho6Ns30hEZY8IgWBnFJmAT0Cx84ZhMa+ZMuOYaOHwYhg6Fhx+2Fl6NySGC3XqaraqXiMgBILAehQCqqkVDHp2JvLg4V0mufn1o0waee85uMxmTw0gmqkvnS1T56np469pIh5H9HTgATzwB8+a5QmrrZc6YLE1EFqpq49NZ1k9bT+eJSH5vuKWI9BeR4qezMZMFqMKXX0KtWjBihKswd/RopKMyxkSQn8djvwASRKQa8C5QBZgQ0qhMZOzcCR07wnXXQalSMHeue+y1YMFIR2aMiSA/iSJRVeOBa4DXVfV+oFxowzIRUaSI65r01VdhwQJXgc4Yk+P5SRRxXp2KHsDX3jhrAjS7mD0b2rd3nQrlz+86E7r/fshjTbkbYxw/iaIX7hHZ51T1LxGpAnwY2rBMyO3aBXfcAZde6poB37DBjc/lqy8rY0wOYk895TSq8P77MHAg7N3rKtA9+SQUKhTpyIwxIXQmTz2le39BRJoDQ4FzvfmT6lFUPZ0Nmkzggw+gRg0YPRrq1Yt0NMaYTM7Pjeh3gfuBhUBCaMMxIXH4sGvd9c47oWJF+OILKFbMbjMZY3zxkyj2qep3IY/EhMb337sG/DZsgLPPhj59oESJSEdljMlC/CSKmSLyMvAlkFzzSlX/CFlU5sxt3eqeXvr0U3ebacYMaNUq0lEZY7IgP4niQu9vYCGIApdnfDgmwzz7LEyeDE8/DYMGuUdfjTHmNNhTT9nJwoXHG/DbtQv27IFq1SIdlTEmEwh1W09lRORdEfnOe19bRG73GdiVIrJGRNaLyOBUpncXkaXea66IWPdop2P/fujfH5o2dd2SApQsaUnCGJMh/Dz2Mh74HijvvV8LDEhvIRHJDYwC2gO1gZtEpHaK2f4CLlPV+sAzwBhfURtHFT77DGrWhJEj4Z574EOrC2mMyVh+EkUpVf0USATw2n3y85hsU2C9qm5Q1WPAx0DnwBlUda6q7vHezgcq+o7cwIQJ0LUrlC3rmt4YORKKF490VMaYbMZPYXasiJTE67xIRC4C9vlYrgKwJeB9DMcLxlNzO5DqY7gichdwF0D+sjn8
dsqxY+5R15o14frrXR2Jnj2tbSZjTMj4Obs8AEwBzhOROUBp4Hofy0kq41ItOReRVrhEcUlq01V1DN5tqajy1bNW6XtG+vln6N3bNeC3dq3rivSOOyIdlTEmm0s3UajqHyJyGVADd/Jfo6pxPtYdA1QKeF8R2JpyJhGpD4wF2qvqLl9R5zQ7d8JDD8H48VC5smt6w/qrNsaESbA+s5sAW1T1H1WNF5ELgOuATSIyVFV3p7Pu34For7XZv4EbgW4ptnEOriLfLapqz7ymZsMGaNLEPdk0eDAMGWIdCRljwipYYfY7wDEAEWkBDAM+wJVPpPt0klfo3Rf3xNQq4FNVXSEivUWktzfbE0BJ4C0RWSwiC057T7Kb/fvd3ypVoFcvWLQIXnjBkoQxJuzSrHAnIktUtYE3PArYoapDvfeLVbVhuIIMlO0r3B06BM88A2PGwJIlrhE/Y4w5Q6GqcJdbRJJuTbUGZgRMs0dsQuGbb6BOHdfSa+fOEBUV6YiMMSboCX8i8JOI7AQOA78AiEg1/D0ea/yKj4ebboLPP4dateCnn6BFi0hHZYwxQJBEoarPich0oBzwgx6/R5UL6BeO4LI9VRBxdSDKlIHnn4cHH4R8+SIdmTHGJLNGASPl999d3xCjR8P550c6GmNMNhfSRgFNBtu3D/r2hQsvhJgY18qrMcZkYpYowimpAb+333bJYvVqaNMm0lEZY0xQ9vRSOK1aBRUqwNSp0Pi0rgCNMSbsrIwilI4ehZdfhgYNoGNHiIuDXLkgd+5IR2aMyWGsjCIzmjnTJYghQ2D6dDcub15LEsaYLMcSRUbbvh169IDLL3dXEN99B6+/HumojDHmtFmiyGg//AATJ8Jjj8Hy5XDllZGOyBhjzogVZmeEZctgzRrXkVD37nDxxVC1aqSjMsaYDGFXFGciNhYGDYJGjdzfuDhX09qShDEmG7EritM1daqrC7F5M9x+O7z4oiuszkTi4uKIiYnhyJEjkQ7FGBMmBQoUoGLFiuTNwPORJYrTsXw5dOrkWnr95Re4JNUeXCMuJiaGIkWKULlyZURS65nWGJOdqCq7du0iJiaGKlWqZNh67daTX/HxMGuWG65bF77+2nUmlEmTBMCRI0coWbKkJQljcggRoWTJkhl+F8EShR+//upqUrduDevWuXFXX53pbjWlxpKEMTlLKP7nLVEEs2cP3HMPNGsGO3e6tpqqVYt0VMYYE1aWKNJy9Kh7mmnMGBgwwLXTdO217qkm41vu3Llp2LAhdevWpWPHjuzduzdD1jt+/Hj69u2bIesK1LJlS2rUqEHDhg1p2LAhn3/+eYZvA2Djxo1MmDAhzelr167lqquuolq1atSqVYuuXbvy77//MmvWLDp06JBhcdxxxx2sXLkSgM8++4xatWrRqlUrFixYQP/+/c9o3cE++xUrVnD55ZdTvXp1oqOjeeaZZwhsTui7776jcePG1KpVi5o1azJw4MBUtzFp0iSefvrpM4ozlHbv3k2bNm2Ijo6mTZs27NmzJ9X5XnvtNerUqUPdunW56aabkm8dDRkyhPr169OwYUPatm3L1q1bAVi2bBk9e/YM1264wo+s9CpQLlpDKibm+PC4cap//BHa7YXQypUrIx2CFipUKHn41ltv1WeffTZD1jtu3Djt06dPhqwr0GWXXaa///77KS8XFxd3SvPPnDlTr7766lSnHT58WKtVq6ZTpkxJHjdjxgxdtmxZ0OXOVLt27XTGjBmntWxq+5/WZ3/o0CGtWrWqfv/996qqGhsbq1deeaWOHDlSVVWXLVumVatW1VWrViWve9SoUalut1mzZrpjx44zijOUHnroIX3hhRdUVfWFF17QQYMGnTRPTEyMVq5cWQ8dOqSqql26dNFx48apquq+ffuS5xsxYoTefffdye9bt26tmzZtSnW7qf3vAwv0NM+79tRTkiNH3COuzz8Pn37q+qwOZ8YOsaemrmDl1v0Zus7a5YvyZMc6vudv1qwZ
S5cuBeC3335jwIABHD58mKioKMaNG0eNGjUYP348U6ZM4dChQ/z5559cc801vPTSSwCMGzeOF154gXLlylG9enXy588PwKZNm7jtttvYsWMHpUuXZty4cZxzzjn07NmTqKgoVq9ezaZNmxg3bhzvv/8+8+bN48ILL2T8+PG+4t69eze33XYbGzZsoGDBgowZM4b69eszdOhQtm7dysaNGylVqhQjRoygd+/ebN68GYDXX3+d5s2b89NPP3HfffcB7v7xzz//zODBg1m1ahUNGzakR48e3H///cnbmzBhAs2aNaNjx47J41q1agXArKQHKoIcwxUrVtCrVy+OHTtGYmIiX3zxBeXLl6dr167ExMSQkJDAkCFDuOGGG2jZsiXDhw/n22+/Zfbs2fz111906tSJq6++muHDh/P1118TGxtLv379WLZsGfHx8QwdOpTOnTszfvx4vvnmG44cOUJsbCwzZszw9dlPmDCB5s2b07ZtWwAKFizIyJEjadmyJX369OGll17iscceo2bNmgDkyZOHe++996R1rl27lvz581OqVCkApk6dyrPPPsuxY8coWbIkH330EWXKlPH9OaV1PM/E5MmTkz+zHj160LJlS1588cWT5ouPj+fw4cPkzZuXQ4cOUb58eQCKFi2aPE9sbOwJ5Q8dO3bk448/ZtCgQWcUox+WKMA12nfPPa6g+qabXKdCJkMlJCQwffp0br/9dgBq1qzJzz//TJ48eZg2bRqPPvooX3zxBQCLFy9m0aJF5M+fnxo1atCvXz/y5MnDk08+ycKFCylWrBitWrWiUaNGAPTt25dbb72VHj168N5779G/f38mTZoEwJ49e5gxYwZTpkyhY8eOzJkzh7Fjx9KkSRMWL15Mw4YNT4q1e/fuREVFATB9+nSGDh1Ko0aNmDRpEjNmzODWW29l8eLFACxcuJDZs2cTFRVFt27duP/++7nkkkvYvHkz7dq1Y9WqVQwfPpxRo0bRvHlzDh48SIECBRg2bFjyiTil5cuXc8EFF6R7TNM6hqNHj+a+++6je/fuHDt2jISEBL799lvKly/PN998A8C+fSd2e//EE08wY8YMhg8fTuPGjU9ISM899xyXX3457733Hnv37qVp06ZcccUVAMybN4+lS5dy1llnpRlnys9+xYoVJ+3feeedx8GDB9m/fz/Lly/nwQcfTHf/58yZw/kBvUNecsklzJ8/HxFh7NixvPTSS7zyyiuAv88p2HcyyYEDB7j00ktTjWfChAnUrl37hHH//vsv5cqVA6BcuXJs3779pOUqVKjAwIEDOeecc4iKiqJt27bJSRTgscce44MPPqBYsWLMnDkzeXzjxo0ZNmyYJYqwGDAARoxwhdQ//JBtOxI6lV/+Genw4cM0bNiQjRs3csEFF9DGO7779u2jR48erFu3DhEhLi4ueZnWrVtTrFgxAGrXrs2mTZvYuXMnLVu2pHTp0gDccMMNrF3rmpufN28eX375JQC33HLLCf84HTt2RESoV68eZcqUoV69egDUqVOHjRs3ppooPvroIxoH9Bcye/bs5BPG5Zdfzq5du5JPtJ06dUpOKtOmTUu+3w+wf/9+Dhw4QPPmzXnggQfo3r071157LRUrVjyDI3pcWsewWbNmPPfcc8TExHDttdcSHR1NvXr1GDhwIA8//DAdOnRI82SXmh9++IEpU6YwfPhwwD12nfRrvE2bNmkmibQ+e1VN88mcU3liZ9u2bcnfB3D1hm644Qa2bdvGsWPHTqhH4OdzCvadTFKkSJHkHwkZZc+ePUyePJm//vqL4sWL06VLFz788ENuvvlmwCXq5557jhdeeIGRI0fy1FNPAXD22Wcnl1mEWs4szE5MhIQEN9y0KTzxhGuvKZsmiUiKiopi8eLFbNq0iWPHjjFq1CjAFdK1atWK5cuXM3Xq1BOe+066pQSuQDQ+Ph7wfxIJnC9pXbly5Tphvbly5Upeb3o0lT5bkrZRqFCh5HGJiYnMmzeP
xYsXs3jxYv7++2+KFCnC4MGDGTt2LIcPH+aiiy5i9erVQbdXp04dFi5cmG5caR3Dbt26MWXKFKKiomjXrh0zZsygevXqLFy4kHr16vHII4+cUgGwqvLFF18k79fmzZupVavWSfufUlqffZ06dViwYMEJ827YsIHChQtTpEgR3/sfFRV1wvemX79+9O3bl2XLlvHOO++cMM3P5xTsO5nkwIEDyQ86pHwFJp8kZcqUYdu2bYBLbGefffZJ80ybNo0qVapQunRp8ubNy7XXXsvcuXNPmq9bt24nXOEcOXIkOfmFWs5LFEuWuEb7vC8t3brBU09BgQKRjSubK1asGG+88QbDhw8nLi6Offv2UaFCBQBfZQUXXnghs2bNYteuXcTFxfHZZ58lT7v44ov5+OOPAXc1cEkGV4Js0aIFH330EeDKCEqVKnXCveMkbdu2ZeTIkcnvk355/vnnn9SrV4+HH36Yxo0bs3r1aooUKcKBAwdS3V63bt2YO3du8m0igP/9738sW7bshPnSOoYbNmygatWq9O/fn06dOrF06VK2bt1KwYIFufnmmxk4cCB//PGH7/1v164db775ZnLCXLRoke9l4eTPvnv37syePZtp06YB7sqjf//+yVeCDz30EM8//3zyFWNiYiKvvvrqSeutVasW69evT/V4vP/++2nGk9bn5Oc7mXRFkdor5W0ncFcySbG8//77dO7c+aR5zjnnHObPn8+hQ4dQVaZPn56ciNcl1dsCpkyZklxuA66Mpm7dumnuZ0bKOYni4EF48EG44ALYsAHKlo10RDlOo0aNaNCgQXIB3COPPELz5s1JSLq6C6JcuXIMHTqUZs2accUVV5xwb/qNN95g3Lhx1K9fn//+97+MGDEiQ+MeOnQoCxYsoH79+gwePDjNk9Abb7yRPF/t2rUZPXo04ApL69atS4MGDYiKiqJ9+/bUr1+fPHny0KBBA1577bUT1hMVFcXXX3/Nm2++SXR0NLVr12b8+PEn/RpN6xh+8skn1K1bl4YNG7J69WpuvfVWli1bRtOmTWnYsCHPPfccjz/+uO/9HzJkCHFxcdSvX5+6desyZMgQ38smCfzso6KimDx5Ms8++yw1atSgXr16NGnSJPlx5/r16/P6669z0003UatWLerWrZv8qzxQixYtWLRoUXICGzp0KF26dOHSSy9NLuBOTVqf06l+J/0YPHgwP/74I9HR0fz4448MHjwYgK1bt3LVVVcB7kfQ9ddfz/nnn0+9evVITEzkrrvuSl6+bt261K9fnx9++OGE7/bMmTO5+uqrMyTO9OSMrlCnTYNevSAmBu66C4YNgxIlQhNgJrJq1arkXybGZEf33XcfHTt2TC5czymOHj3KZZddxuzZs8mT5+Si5tT+960r1PTkywdnnQVz5sA77+SIJGFMTvDoo49y6NChSIcRdps3b2bYsGGpJolQyJ5PPcXFue5H9+2DZ5+FFi1cA365ckZeNCanKFOmDJ06dYp0GGEXHR1NdHR02LaX/c6cc+e6cohBg1yzG4mJbnwOTRJZ7daiMebMhOJ/PvucPXfvduUPzZvD3r0waRJ88UWOTRDgOjDZtWuXJQtjcgj1+qMokMFPcWafW0+7dsGECTBwIDz5JBQuHOmIIq5ixYrExMSwY8eOSIdijAmTpB7uMlLWThRr1sAnn7gKc9HRsGkTlCwZ6agyjbx582ZoL1fGmJwppPdlRORKEVkjIutFZHAq00VE3vCmLxWR81Nbz0kOH3bJoX59eO012LLFjbckYYwxGS5kiUJEcgOjgPZAbeAmEUlZdbE9EO297gLeTm+9hY/GQr168Mwz0KULrF4NlSplcPTGGGOShPKKoimwXlU3qOox4GMgZf31zsAHXnPp84HiIlIu2Eor7P3XFVBPmwYffghlyoQmemOMMUBoyygqAFsC3scAKdvvTm2eCsAJ9fVF5C7cFQfAUVm3bjk5rCZmGkoBOyMdRCZhx+I4OxbH2bE47rQ7
1whlokitqc+Uz2n6mQdVHQOMARCRBadbDT27sWNxnB2L4+xYHGfH4jgRWZD+XKkL5a2nGCCw8KAikLLxdD/zGGOMiaBQJorfgWgRqSIi+YAbgSkp5pkC3Oo9/XQRsE9VT24m0hhjTMSE7NaTqsaLSF/geyA38J6qrhCR3t700cC3wFXAeuAQ0MvHqseEKOSsyI7FcXYsjrNjcZwdi+NO+1hkuWbGjTHGhFfObQjJGGOML5YojDHGBJVpE0XImv/Ignwci+7eMVgqInNFpEEk4gyH9I5FwHxNRCRBRK4PZ3zh5OdYiEhLEVksIitE5KdwxxguPv5HionIVBFZ4h0LP+WhWY6IvCci20VkeRrTT++8qaqZ7oUr/P4TqArkA5YAtVPMcxXwHa4uxkXAr5GOO4LH4mKghDfcPicfi4D5ZuAelrg+0nFH8HtRHFgJnOO9PzvScUfwWDwKvOgNlwZ2A/kiHXsIjkUL4HxgeRrTT+u8mVmvKELS/EcWle6xUNW5qrrHezsfVx8lO/LzvQDoB3wBbA9ncGHm51h0A75U1c0Aqppdj4efY6FAERERoDAuUcSHN8zQU9WfcfuWltM6b2bWRJFW0x6nOk92cKr7eTvuF0N2lO6xEJEKwDXA6DDGFQl+vhfVgRIiMktEForIrWGLLrz8HIuRQC1chd5lwH2qmhie8DKV0zpvZtb+KDKs+Y9swPd+ikgrXKK4JKQRRY6fY/E68LCqJrgfj9mWn2ORB7gAaA1EAfNEZL6qrg11cGHm51i0AxYDlwPnAT+KyC+quj/EsWU2p3XezKyJwpr/OM7XfopIfWAs0F5Vd4UptnDzcywaAx97SaIUcJWIxKvqpLBEGD5+/0d2qmosECsiPwMNgOyWKPwci17AMHU36teLyF9ATeC38ISYaZzWeTOz3nqy5j+OS/dYiMg5wJfALdnw12KgdI+FqlZR1cqqWhn4HLg3GyYJ8Pc/Mhm4VETyiEhBXOvNq8IcZzj4ORabcVdWiEgZXEuqG8IaZeZwWufNTHlFoaFr/iPL8XksngBKAm95v6TjNRu2mOnzWOQIfo6Fqq4Skf8BS4FEYKyqpvrYZFbm83vxDDBeRJbhbr88rKrZrvlxEZkItARKiUgM8CSQF87svGlNeBhjjAkqs956MsYYk0lYojDGGBOUJQpjjDFBWaIwxhgTlCUKY4wxQVmiMCdIr/XJgPke81rhXOq1TnphBsfxrYgU94b7i8gqEflIRDoFazXWm3+u97eyiHTzub3/iMgT3vBQEfnb26/FIjIsyHJDRWSg7x1LfR2VReSwt62VIjJaRE7pf1NEGovIG95wSxG5OGBa74xoviPFcVkpIjf5WGaAV4cjvfk+FpHoM43RhIY9HmtOICItgIO4hsPqpjFPM+BVoKWqHhWRUriWOENSM15EVuNqnP91isu1BAaqagcf884FOqnqThEZChxU1eE+lvM9b5B1VAa+VtW6IpIH1/Lt66r65Wmu74xjSm+93kl9IVBSVeOCLLMRaJxenQURuQy4WVXvzMCQTQaxKwpzAh+tTwKUwzUNcdRbZmdSkhCRjSLyooj85r2qeeNLi8gXIvK792rujS8sIuNEZJl3dXJdwHpKichoXPPRU0TkfhHpKSIjvXnKiMhX4voYWJL0K1pEDnpxDsPVTF7sLfuLiDRM2gkRmSMi9UWkOnA02MlMRO704l7i7cdJv5K9K5+V3n587I0r5F2l/S4ii0QktdZuA49/PDAXqCYi54rIdG9908XVwEdEuojIci+Wn71xLUXkay/p9Abu9/b70qSrHhGpJSLJTVZ4VzJLveELROQncY0Hfi/ptCiqqutwFbZKeMu/LSILxF1lPpV0PIDywEwRmemNaysi80TkDxH5TEQKe6v8BbjCS5Qmswl3e+n2yvwvoDJptGfvTS+Ma2BtLfAWcFnAtI3AY97wrbhfygATgEu84XOAVd7wi7hfz0nLlwhYT6lUhnsCI73hT4AB3nBu
oJg3fND72zJp+977HknbwrWsusAb7gW8EjDfUOBvbx8X4xqUKxkw/VmgX8C8A73hrUB+b7i49/d53C9lcP1DrAUKpXW8gYK4JinaA1OBHt7424BJ3vAyoEKK7STva2BMqcS4GKjqDT8MPI6ruTsXKO2NvwFXuznl5x64nvOBXwKmnRXwOcwC6qfy2ZUCfk7af2/7TwSs40fggkh//+118suuKMwpU9WDuFZJ7wJ2AJ+ISM+AWSYG/G3mDV8BjBSRxbj2ZoqKSBFv/KiAde/Bv8uBt73lElR1XzrzfwZ0EJG8uBPveG98OW8/Ar2mqg291/dAXe+KZBnQHaiTyvqXAh+JyM0c7+ugLTDY2+9ZQAFcokzpPG+eOcA3qvod7thN8Kb/l+OtAs/BNUdxJ+7EfCo+Bbp6wzfgkm0NoC6uRdXFuOSRVp8m94vIGuBXXOJI0lVE/gAW4Y5N7VSWvcgbP8fbTg/g3IDp23FXICaTscs8ky4RqYT7dQswWl07Qgm4E98s7+TZg+Mn3sCCr6ThXEAzVT2cYt1CmJqHV9VDIvIjrvOWrriWZgEOA8XSWXw88B9VXeIlxZapzHM1roexTsAQEamDa1foOlVdk876/1TVhuntgrcfvcU9PHA1sDjwdpoPnwCficiXblW6TkTqAStUtVk6y4JLoMNF5FrgAxE5D5doBwJNVHWPiIzHJcSUBPhRVdMqBC+A+yxMJmNXFCZdqrol4Nf1aBGpISc+odIQ2BTw/oaAv/O84R+AvkkzBJzcUo4vcQqhTQfu8ZbLLSJFU0w/ABRJMW4s8Abwu6omlcWsAqqls60iwDbvaqR7yoninlKqpKozgUG420yFcQ3V9fMSIiLSyN+uAe520I3ecHdgtreO81T1V1V9AtjJic1GQ+r7DYCq/gkkAENwSQNgDVBa3EMKiEheL8mlSV1B+wLcD4SiQCywT1zLrO3TiGU+0FyOl1sV9MqHklQHVgTbrokMSxTmBOJan5wH1BCRGBG5PZXZCgPvJxXc4m4nDA2Ynl9EfgXuA+73xvUHGnsFsytxBa7g7veXSCqcBVqdQrj3Aa28K5qFnHw7aCkQ7xX63g+gqguB/cC4gPl+BholnczTMAR3u+VHYHUq03MDH3qxLML98t6La7U0L7BU3CPHz5zC/vUHennH+BZvfwFeFlf4v9yLfUmK5aYC1yQVZqey3k+Am3G3oVDXfej1wIveZ7AY1w97ep4GHsCVmSzCneTfw90aSzIG+E5EZqrqDlwZ00Rvn+bj+oRIavr7sGbPrgKyPHs81mQo8fk4ZKSISHncLbOaGtAVpoiMAKaq6rRIxZaTeYl8v6q+G+lYzMnsisLkGOIqnf2KeyorZX/Jz+OeODKRsRd4P9JBmNTZFYUxxpig7IrCGGNMUJYojDHGBGWJwhhjTFCWKIwxxgRlicIYY0xQ/w8ZM7WaLOQO0AAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "## Plot the ROC AUC curve\n",
    "from sklearn.metrics import roc_auc_score, roc_curve\n",
    "plt.figure()\n",
    "\n",
    "# Add the models you want to view on the ROC plot to this list\n",
    "auc_models = [\n",
    "    {\n",
    "        'label': 'Random Forest Classifier',\n",
    "        'model': RandomForestClassifier(n_estimators=1000, min_samples_split=2,\n",
    "                                        max_features=7, max_depth=None),\n",
    "        'auc': 0.8325  # AUC value shown in the plot legend\n",
    "    },\n",
    "]\n",
    "# Loop through all models in the list\n",
    "for algo in auc_models:\n",
    "    model = algo['model']  # select the model\n",
    "    model.fit(X_train, y_train)  # train the model\n",
    "    # Compute the false positive rate and true positive rate\n",
    "    fpr, tpr, thresholds = roc_curve(y_test, model.predict_proba(X_test)[:, 1])\n",
    "    # Plot the curve; the legend shows the AUC value stored in auc_models\n",
    "    plt.plot(fpr, tpr, label='%s ROC (area = %0.2f)' % (algo['label'], algo['auc']))\n",
    "# Custom settings for the plot\n",
    "plt.plot([0, 1], [0, 1],'r--')\n",
    "plt.xlim([0.0, 1.0])\n",
    "plt.ylim([0.0, 1.05])\n",
    "plt.xlabel('1-Specificity(False Positive Rate)')\n",
    "plt.ylabel('Sensitivity(True Positive Rate)')\n",
    "plt.title('Receiver Operating Characteristic')\n",
    "plt.legend(loc=\"lower right\")\n",
    "plt.savefig(\"auc.png\")\n",
    "plt.show()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
